Deleting a New

Recently I’ve been working on some software to find memory leaks and fix memory fragmentation too. I want to start with an example of a leak; next time I will talk about fragmentation.

The leak happens when you allocate objects with new[] and then delete them with a scalar delete. If you’re C++ savvy, you might say “WTF, of course it leaks”. But I want to be honest for a moment and say that, apart from not calling the destructors of the instances in the array, I really did not expect it to leak. After all, it’s a simple memory pointer we are talking about. But I was wrong, and I thought it was worth a post to share this with you.

Suppose we have the following code:

class A {
 public:
  A() { }
  ~A() { }
 private:
  int x, y;
};

...
A * a = new A[3];
...
delete a; // BUGBUG

What really happens in this code? How do the constructors and destructors get called? How does the compiler know whether to destroy a single object, or how many objects to destroy, when you’re finished working with the array?

Well, first of all, I have to warn you that some of it is implementation specific (per compiler). I’m going to focus on Visual Studio though.

The first thing to understand is that the compiler knows which type of object to construct and destroy. All this information is available at compile time (in our case, since the type is static). If the object’s type were only known at run time (a polymorphic class), the destructor would be called dynamically, but I don’t care about that case now…

So to allocate memory for the objects, it will eventually do something like malloc(3 * sizeof(A)) and return a pointer to assign to the variable ‘a’. Here’s the catch: the compiler can’t know at compile time how many destructors to call when it is asked to delete the array, right? It has to keep the count somewhere. But where?
Therefore the memory allocation call has more to it. The way MSVC does it is as follows (some pseudo code):

int* tmpA = (int*)malloc(sizeof(A) * 3 + sizeof(int)); // Notice the extra space for the integer!
*tmpA = 3; // This is where it stores the count of instances! ta da
A* a = (A*)(tmpA + 1); // Skip that integer, really assigns the pointer allocated + 4 bytes in 'a'.

Now all it has to do is call the constructor for each entry in the array, easy.
When you work with ‘a’, the implementation is hidden from you; you just get a pointer that you should use only for accessing the array and nothing else.
At the end you’re supposed to delete the array. The right way to delete this array is ‘delete []a;’. The compiler then understands that you asked to delete a number of instances rather than a single instance. So it will loop over all the entries, call the destructor for each instance, and finally free the whole memory block. One of the questions I asked in the beginning was how the compiler would know how many objects to destroy. We already know the answer by now: it stored the count just before the pointer you hold.

So deleting the array in a correct manner (and reading the counter) would be as easy as:

int* tmpA = (int*)a - 1; // Notice we take the pointer the compiler handed to you, and get the 'count' from it.
for (int i = *tmpA - 1; i >= 0; i--) a[i].~A(); // Destructor of class A, called in reverse order of construction.
free(tmpA); // Notice we call free() with the original pointer that got allocated!

So far so good, right? But the leak happens if you call a scalar delete on the pointer you got from allocating a new array. And the mismatch is a problem even for an array of primitive types (integers, chars, etc.) that needs no destructor calls: the behavior is still undefined, and you can still leak memory. Why's that?
Because new[], as we saw in this implementation, returns a pointer that does not point to the beginning of the allocated block. When you eventually call scalar delete on it, the memory manager cannot find the start of the allocated block (because you feed it an offset into the block), and then it has a few options: ignore your call and leak that block; crash your application or give you a notification, as in Debug mode; or maybe, in extreme cases, cause a security breach...

In some forum I read that there are many unexpected behaviors in our case; one of them made me laugh so hard that I feel I need to share it with you:
"A* a = new A[3]; delete a; Might get the first object destroyed and released, but keep the rest in memory".

Well, it doesn't take a genius to understand that the compiler prefers to bulk allocate all the objects in the same block... and yet, funny.
The point the guy tries to make is that you cannot know what the compiler's implementation does; as weird as it might be, don't ever rely on it. And I totally agree.

So in our case the following code is wrong as well, and may leak:

int*a = new int[100];
...
delete a;

The point is that when you new[], you ~~should~~ must call a corresponding delete [].
Apart from keeping your code readable and correct, matching them means it won't break. Never trust the compiler to cover for you, just code it right in the first place.

And now you can imagine what happens if you allocate a single object and try to delete[] it. Not healthy, to say the least.

Tags: Memory Leaks, new delete c++

This entry was posted on Sunday, October 24th, 2010 at 8:23 am and is filed under C++, Reversing.

Insanely Low-Level
