A C Tidbit

Yeah well, the title is a bit weird… So I’ve been reading the C standard for fun (get a girl?) and I was interested in two things. First, while I was reading it a friend asked me whether the local (auto) variables in the scope of a function are laid out in any special order. The short answer is no, and I won’t extend this one much. The standard doesn’t even mention a “stack”, let alone how variables are supposed to sit in memory. This is a bit confusing at first, because in structures the order in which you define the members is the order in which they lie in memory (and in a union they all start at the same address). But that only sounds normal: if you need to read something from hardware, you really care about the order of the members of the structure…

As it happened, while I was reversing I saw lots of variable recycling. That is, the compiler uses a variable (which is on the stack), and afterwards, when it’s out of scope, it re-uses the same memory for another variable in the same function. Or it might be that the coder sucked so much that he used the same variable a few times… :P which I doubt, since in one place the dword was used as a byte. So the only thing you know about a function’s stack is its layout for that specific version of the function’s code, and only after compilation (an assembly listing is good too).

The other thing I was curious about is the pointer arithmetic that is mixed with integer expressions:

char *p = 0;
p + 5

We all know that it will evaluate to 5. But what about:

long *p = 0;
p + 5 ?

This one will evaluate to 20 (assuming a 4-byte long), that is 5 * sizeof(type)…

Actually the compiler translates the array subscript operator [] such that p[i] is *(p + i). Nothing special here either, unless you haven’t messed with C for a long while… It becomes really powerful when you cast from one pointer type to another and use the indirection operator. For example, say you want to scan for a dword in a memory block:

long *p = <some input>;
for (int i = 0; i < 1000; i++)
   if (p[i] == value) found…

But in this case you can miss the dword, since we only look at offsets that are multiples of sizeof(long) and not at every byte offset. So we have two options: either change the type p points to, to char, or make a cast… Either way we need to fix the limit of the scan, of course.

So now it becomes:

if (*(dword*) ( ((char*)p) +i) == value) found…

This way p is scanned in byte units, because i is multiplied by sizeof(char), which is 1. We take that pointer to char, which advances by 1 byte every iteration, cast it to a pointer to dword and then dereference that… I think you cannot avoid a cast (or maybe an automatic conversion) to get this task done, short of copying the bytes out with something like memcpy. Keep in mind that dereferencing a misaligned pointer like this is not guaranteed to work on every platform.

Now it might be obvious to most of you, but I doubt it, since I fell into this trap as well:

long a[4] = {0};
printf("%td", &a[1] - &a[0]);

What will this code print when executed? I thought 4, as the size of an element of ‘a’ is 4 (from the ‘long’ type), but to my surprise ‘1’ was printed. The subtraction behaves as if the subscripts were the operands, thus 1 – 0, and the result is a signed integer (of type ptrdiff_t).

The bottom line is that &p[i] sits at (char*)p + i*sizeof(p[0]), and &p[i] - &p[j] is ((char*)&p[i] - (char*)&p[j])/sizeof(p[0]). Well, the latter is less useful, but if you thought the printf above would print the sizeof the element, you were wrong :)
