Arkon Under the Woods

May 2nd, 2009

Yeah, I am not alright, I am even an ass, leaving you all (my dear readers, are there any left?) without saying a word for the second time. At the end of last year I was in South East Asia for 3 months, and now I have been in South America for 3 months and counting… I really wish to keep this blog totally technological, but I guess we are all human after all. So yeah, I have been trekking a lot in Chile and Argentina (mostly in Patagonia) and having a great time here, and now I am in Buenos Aires. Good steaks and wines, ohhh and the girls. Say no more.

It is really cool that in almost every shitty hostel you go to, you will find WiFi available for free. So carrying an iPod touch with me I can actually be online, but apparently not many web developers think about mobile web pages, and thus I couldn’t write blog posts with Safari, because there is some problem with the text area object. For some reason, I guess some JS code doesn’t run well on the iPod, so the on-screen keyboard never comes up and I cannot type anything in. WordPress…

I am always surprised to see how many computers here, in coffee shops or Internet shops, are not really secured. Some of the time you run as admin. And there is no anti-virus, which I think is a good thing to have for the average user. And if you plug in your camera to upload some pictures, the next time you look you will see some stupid new files on it, named desktop.ini and autorun.inf. Sounds familiar? And then I read some MS blog post about disabling AutoRun for removable storage devices... yippee, about time. What I am also trying to say is that one can create a zombie army so easily with all those computers… the ease of access and the lack of protection drive me mad.

Anyhow, I had some free time, of course; I am on vacation, sort of, after all. And I accidentally reached an amazing blog that I couldn’t stop reading for a few days. Meet NO EXECUTE! If you are a low-level freak like me, you will really like it too. Darek begins with hardware stuff, which will fill some gaps for most people I believe, and then talks about virtualization and emulators (his obsession), and I just read it like some fantasy book, eager to get to the next chapter every time. I learnt tons of stuff there, and I really like to see that even today a few people still measure and check optimizations in cycles per instruction rather than in seconds or milliseconds. One of the things I really liked there was a trick he pulled for the case where the guest OS runs little endian, for instance, and the host runs big endian. Every memory access wider than one byte then has to be byte-swapped, of course. So in order to eliminate the byte swaps, which are expensive, he kinda turned all the memory of the guest OS upside down, and therefore the endianness flipped as well. Now it might sound like a simple matter, but this is awesome, and the way he describes it, you can really feel the excitement behind the invention… He also talks about how lame Intel and AMD are to come up with new instruction sets every Monday, which I have already mentioned in the past.
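To make the upside-down memory trick a bit more concrete, here is a minimal sketch of how I understand it. This is not Darek’s code; the names (GUEST_SIZE, mirror, guest_load32_mirrored) are made up for illustration, and the big-endian host load is spelled out byte by byte so the sketch behaves the same on any machine. The guest byte at address a is stored at host index GUEST_SIZE - 1 - a, so a 4-byte guest access becomes a single plain host load at the mirrored address, with no swap.

```c
#include <stdint.h>
#include <stddef.h>

#define GUEST_SIZE 16u  /* tiny guest memory, just for the sketch */

/* What a little-endian guest expects from a 32-bit load at 'addr'. */
static uint32_t guest_load32_reference(const uint8_t *guest, size_t addr)
{
    return (uint32_t)guest[addr]
         | (uint32_t)guest[addr + 1] << 8
         | (uint32_t)guest[addr + 2] << 16
         | (uint32_t)guest[addr + 3] << 24;
}

/* A native 32-bit load on a big-endian host, written out byte by byte. */
static uint32_t host_be_load32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16
         | (uint32_t)p[2] << 8  | (uint32_t)p[3];
}

/* Store the guest memory upside down: guest byte a -> host index SIZE-1-a. */
static void mirror(const uint8_t *guest, uint8_t *host)
{
    size_t a;
    for (a = 0; a < GUEST_SIZE; a++)
        host[GUEST_SIZE - 1 - a] = guest[a];
}

/* The same guest load against the mirrored memory: only the address is
   adjusted, the bytes themselves are never swapped. */
static uint32_t guest_load32_mirrored(const uint8_t *host, size_t addr)
{
    return host_be_load32(host + (GUEST_SIZE - addr - 4));
}
```

The price is a small address adjustment per access (here a subtraction) instead of a byte swap per access.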

Regarding diStorm now, I decided to discontinue the development of the current diStorm64 version. But hey, don’t worry. I am going to open source diStorm3, and I am still considering making it dual licensed. The benefit of diStorm3 is structured output, and believe me, the speed is amazing and, like in the good old days, the structure per instruction is unbelievably tiny in size (relative to other disassemblers I saw out there), and you guys are gonna like it.

Thing is, I have no idea when I am getting home… Now with this Swine Flu spreading like hell, I don’t know where I will end up. The only great thing about this Swine Flu, so to speak, is that you can see Evolution in Progress.

Salud

VML + ANI ZERT Patches

February 3rd, 2009

It is time to release an old presentation about the VML and ANI vulnerabilities that were patched by ZERT. It explains the vulnerabilities and how they were closed. It is quite technical; Assembly is required if you wanna really enjoy it. I also gave a talk using this presentation at CCC 2007. It so happened that I wrote the patches, with the extensive help of the team, of course.

ZERT Patches.ppt

Oh No, My XPSP3

February 2nd, 2009

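#include <windows.h>
#include <string.h>

int main()
{
 /* Fill the first 1000 bytes of the buffer with 'c'; the rest stays zero,
    so the bogus wallpaper path is still terminated. */
 WCHAR c[1000] = {0};
 memset(c, 'c', 1000);
 SystemParametersInfo(SPI_SETDESKWALLPAPER, 0, (PVOID)c, 0);

 /* Then ask for the wallpaper path back. */
 WCHAR b[1000] = {0};
 SystemParametersInfo(SPI_GETDESKWALLPAPER, 1000, (PVOID)b, 0);
 return 0;
}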

Two posts ago I talked about vulnerabilities. So here’s some Zero Day. This will crash your system, unless you’re on Vista (which is already immune to it). And why the heck is this thing still not closed on SP3?

It might be exploitable; I didn’t research it any further than the BSOD from the security cookie… Maybe on some builds compiled without /GS it can be easily exploited. Or maybe overwriting enough of the stack to trigger an exception could do it.

“Remember to let her into your heart,
Then you can start to make it better” - The Beatles.

Escape

February 1st, 2009

Wanted to share this with the world:

e 0:0 cc
e 100 c4 c4 54 27

Can’t Stand it When…

January 31st, 2009

1) … when people say they write code in Assembler. Now, if that sentence didn’t bother you, then you probably shouldn’t read any further. It’s like telling someone that I know how to code in Compiler. And that’s wrong: you don’t code in a compiler, you use a compiler in order to compile your code in whatever language you really write in. So the proper word would be “Assembly”. And I encounter too many people, who know some Assembly too, that say it incorrectly, and it freaks me out. The next thing I reply is “you write in compiler, ohhh wow, very nice”, but they don’t get it.

2) … when you think you’re cool because you don’t use gotos, since most people think it’s a bad habit, and yet you do it indirectly and you are even cooler now. I will just show a code snippet and say no more than this - your code should be readable, not make you a cool haxor guy (well, maybe that too), and using goto for cleaning up resources is legitimate !!!!

    status = success;

    do {
        p = (char*)malloc(1000);
        if (p == NULL) {
            status = fail;
            break; // <-- oh yeah biatch.
        }
    } while (FALSE); // <-- oh no, so lame.
    if (status != success) {
        if (p) free(p);
        if (bla) free(bla);
        return status;
    }

    status = do_more_stuff(…);
    return status;
}
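For comparison, here is roughly what the same kind of function looks like with a plain goto for cleanup. It is only a sketch: do_work, the enum names and the two buffers are made up for the example, and the “real work” is a placeholder.

```c
#include <stdlib.h>

enum status { STATUS_SUCCESS = 0, STATUS_FAIL = 1 };

/* Hypothetical worker: allocates two buffers and releases both of them
   on every exit path through a single cleanup label. */
static enum status do_work(size_t first_size, size_t second_size)
{
    enum status status = STATUS_FAIL;
    char *p = NULL;
    char *bla = NULL;

    p = malloc(first_size);
    if (p == NULL)
        goto cleanup;

    bla = malloc(second_size);
    if (bla == NULL)
        goto cleanup;

    /* ... the real work on p and bla would go here ... */
    p[0] = 'x';
    bla[0] = 'y';
    status = STATUS_SUCCESS;

cleanup:
    free(bla);  /* free(NULL) is a no-op, so no checks are needed */
    free(p);
    return status;
}
```

One label, one place where resources are released, and the happy path reads top to bottom without a fake loop around it.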

3) … when something goes wrong internally in some function and you don’t bubble the return code up to the caller, and you pretend it’s “business as usual” when something is seriously wrong. Then some guy like me needs to come in and debug the control flow to find out what went wrong.

4) … when you cannot disassemble any address you want in the Visual Studio debugger (under Platform Builder), so you need to change the PC (ARM’s counterpart of the IP) to whatever value, go to “Show Current Statement”, and only then set a breakpoint there and view the Assembly code, and then restore the PC to its original value.

Got some more? Share them with us.

NULL, Vulnerabilities and Fuzzing

December 31st, 2008

I remember seeing Ilja at BH07. We talked about Kernel attacks, aka privilege escalation. He also told me back then that he had found some holes through which he managed to execute code. I think the target platform was Windows, although Ilja specializes in Unix. Back in ‘05 he already gave a talk about Unix Kernel Auditing. Probably nothing new there, at least for the time being. However, the approach of fuzzing the kernel, the system calls to be accurate, was pretty new. But feel free to correct me if I’m wrong about it. And it seems Ilja managed to find some holes using fuzzing. (BTW, a much more interesting paper from him is the one about Unusual Bugs.)

Personally, I don’t believe in fuzzing. Usually the holes I find are ones a fuzzer has no way of finding. I do believe, though, that you need to mix tools and knowledge in order to find holes and audit software in a better way. It is enough that there is a simple validation of some parameter you pass to a specific potentially hole-y function, and all your tests can be thrown away because of that validation; there is still a weakness in that function, but you won’t get to it. Then you say “Ah Uh”, and you think that you can refine the randomness of the parameters you pass to that function and hopefully prevail. Well, it might work, it might not. As I said, I’m not a big fan of fuzzing.

Still, it might be cool to have a tool that analyzes the code of a function and builds the parameters in a special way to reach a code coverage of 100% on that function. That is not fuzzing anymore; it means you walk all paths of execution, and the chances of finding a weakness are so much greater. Writing such a tool is craziness, and yet possible, if you ask me.

Fuzzing or not, there are still weaknesses in Win32k, which is supposed to be one of the most “secured”/audited components in the Kernel, probably because many researchers have worked on it as well. And that’s simply sad.

Speaking about Ilja’s fuzzing of the kernel and stuff, and thinking we are cool for finding weaknesses nowadays: Mark Russinovich wrote NTCrash back in ‘96, for God’s sake, and it was a Fuzzer(!), but back then nobody called it that or knew about fuzzers. And NTCrash, as simple as it is, found some weaknesses in kernel system calls of NT4 ;) Respect (though today it won’t even scratch the kernel, so we might think ourselves cool for still finding stuff :) ).

A friend and I are trying to audit another application, and my friend found some NULL dereference which crashes that software. So we fired up Olly and tried to see what’s going on. It seems that some interface is queried and returns a successful code value and at the same time we get NULL for that interface, which means something is really f*cked up there. Thing is, as you probably can imagine for yourself, we want to execute code out of it. But odds seem to be against us at this time, since we can’t control that NULL or anything about it.

I then wanted to see what people have done with NULL before, and how to exploit it better. Usually 99% of the applications running out there don’t have page 0 mapped in their address space. CSRSS and NTVDM, for instance, do have it mapped, but who cares now…? It doesn’t help our cause. Besides, you probably can’t control page 0 and its data anyway. So I came across that Flash exploitation. To be honest, I didn’t read all of the white paper about the exploitation; I only looked for how the arbitrary data write worked. And it seems that some CALLOC failed to allocate memory because of an integer overflow weakness, and from there you got a NULL pointer to begin with. But Flash didn’t access that pointer immediately - it had some pointer arithmetic added to it. And you guessed it right, you can control some offset before the pointer is really accessed, thus you can write (almost) anywhere you want. Now I really don’t underestimate the exploitation; from the bits I read it is a crazy and very beautiful exploitation. But to say that it is a new technique and a new class of exploitation is something I really don’t agree with. You know what, looking at it in a different light - it probably wouldn’t have led to code execution if that CALLOC had not returned NULL, because then you wouldn’t know where you are on the heap and you couldn’t write accurately to any address you knew. And besides, the NULL wasn’t dereferenced directly; an offset was added to it (no matter what the calculation was, for the sake of conversation), so I don’t see it as that exciting if you ask me (again, not the exploitation but the “new class of exploitation”). Still, you should check it out :)
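To show why a NULL base plus a controlled offset is a different beast from a plain NULL dereference, here is a tiny sketch. It is not the Flash code; both helpers and their names are made up, and it assumes the usual platforms where a NULL pointer converts to the integer 0. Nothing is dereferenced, it only computes the address such code would end up touching.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical check for the classic calloc-style bug: count * size wraps. */
static int mul_overflows(size_t count, size_t size)
{
    return size != 0 && count > SIZE_MAX / size;
}

/* The address that "base[index]" would touch for elements of elem_size bytes.
   With base == NULL this is simply index * elem_size, i.e. whoever controls
   the index picks the address. */
static uintptr_t effective_address(const void *base, uint32_t index,
                                   uint32_t elem_size)
{
    return (uintptr_t)base + (uintptr_t)index * elem_size;
}
```

With a real heap pointer as the base, the same controlled index only gives you an address relative to something you don’t know; with NULL as the base, it gives you an absolute one.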

So, as I saw that no one did anything really useful with a real NULL dereference, it seems that the weakness he found is only a DoS, but maybe we can control something there, yet to be researched…

String Initialization is Tricksy

December 29th, 2008

A friend of mine had to hand in an assignment for Computer Science at university. As I understood it, it was a relatively easy assignment. And the point is that this friend is a very experienced programmer and knows a thing or two about it. Anyway, he had a line in the code which goes like this:
char buf[1024] = "abc";

You don’t even need to know C in order to understand that line, right? I assume we all agree on that. It simply initializes the buffer with a constant string literal. So his lecturer asked him what this line does precisely. And to his surprise, his answer was incorrect. The correct answer is that the whole buffer is initialized and then the string constant is copied (this can be done in a few ways, for example copying a buffer with the zeros at the end of it). So today another friend called me on the phone to ask about this thing, why our first friend was wrong about it. Now as a reverser, I suppose I need to know the answer to such a simple matter as well. But the sad part was that I was just as wrong as the two of them. I fired up the C standard and started to search for the solution. I wanted living proof of the matter at hand. Looking here and there, it took me around 15 minutes to lay my hands on the sentence that settled the whole matter. And I quote:

“If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.”

The part about string literals is the answer - if there are fewer characters than the size of the array being initialized, the remainder has to be initialized as well. There is another clause which explains how that initialization is done, but for now, let it be ‘zeroing’.
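If you want to see it for yourself rather than trust the quote, a small check along these lines should do. The helper names are made up; the only thing it relies on is the rule above, that the rest of the array is initialized like an object with static storage duration, which for char means zero.

```c
#include <stddef.h>

/* Returns 1 if every byte in buf[from .. size-1] is zero, 0 otherwise. */
static int tail_is_zero(const char *buf, size_t from, size_t size)
{
    size_t i;
    for (i = from; i < size; i++) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}

/* Initializes a local array from a short literal and checks both the
   copied characters and the remaining 1021 bytes. */
static int check_partial_initializer(void)
{
    char buf[1024] = "abc";

    return buf[0] == 'a' && buf[1] == 'b' && buf[2] == 'c'
        && tail_is_zero(buf, 3, sizeof buf);
}
```

So a single innocent-looking line can cost a 1 KB clear on every call of the function.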

Now the reason I was wrong about it is that I happened to see a lot of code like this (for example):
mov byte ptr [buf+0], 'a'
mov byte ptr [buf+1], 'b'
mov byte ptr [buf+2], 'c'
mov byte ptr [buf+3], 0

in lots of functions, and that means the source C code is:

char buf[] = "abc";

The standard says about this case that the size of the buffer is taken from the size of the string literal; don’t forget the null termination character as well. So that’s why I never saw a memset coming in to initialize the whole buffer. Besides, maybe most people code it this way:
char buf[1024];
strcpy(buf, "abc");

Which doesn’t lead to a memset or other way of initialization of the rest of the array.
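A quick sketch of these two cases, for contrast. The function names are made up, and the 0x55 fill is just my stand-in for whatever garbage happens to be on the stack, so that the untouched bytes can be counted.

```c
#include <string.h>
#include <stddef.h>

/* With no explicit size, the array is exactly as big as the literal:
   'a', 'b', 'c' and the terminating '\0'. */
static size_t size_of_unsized_array(void)
{
    char buf[] = "abc";
    return sizeof buf;
}

/* strcpy only writes the string and its terminator; everything after
   that keeps whatever was there before. */
static size_t bytes_left_untouched_by_strcpy(void)
{
    char buf[1024];
    size_t i, untouched = 0;

    memset(buf, 0x55, sizeof buf);  /* marker standing in for stack garbage */
    strcpy(buf, "abc");

    for (i = 0; i < sizeof buf; i++) {
        if (buf[i] == 0x55)
            untouched++;
    }
    return untouched;
}
```

In neither case does the compiler have any remainder to zero, which is why no memset shows up in the disassembly.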

Instructions’ Prefixes Hell

December 21st, 2008

Since the first day diStorm was out, people haven’t known how to deal with the fact that I drop (ignore) some prefixes. It seems that dropping unused prefixes isn’t such a great feature for many people, and it only complicates the scanning of streams. Therefore I am thinking about removing the whole mechanism, or maybe changing it in a way that still preserves the same interface but behaves differently.

For the following stream: “67 50”, the result by diStorm will be: “db 0x67” - “push eax”. The 0x67 prefix is supposed to change the address size, but no address is used in our case, thus it’s dropped. However, if we look at the hex code of the “push eax” part we will see “67 50”. And this is where most people become dumbfounded. Getting the same prefix byte of the stream twice, in two results, is confusing in a way. Taking a look at other disassemblers will tell you that diStorm is not the only one to play such games with prefixes. Sometimes I get emails regarding this “impossible” prefix, since it gets output twice, which is wrong, right? Well, I don’t know, it depends on how you choose to decode it. The way I chose to decode prefixes was really advanced: each prefix could be ignored unless it really affected (one of) the operands. I had to keep track of each prefix and know whether it affected any operand in the instruction, and only then decide which prefixes to drop. This all sounds right in a way. Hey, at least to me.

However, we haven’t even talked about what to do if you have multiple prefixes of the same family (segment override: DS, ES, SS, etc). Now this one is really up to the interpretation of the designer. Probably the way I did it in diStorm is wrong, I admit it; that’s why I want to rewrite the whole prefixes thing from the beginning. There are 4 or 5 types of prefixes, and according to the specs (Intel/AMD), I quote: “A single instruction should include a maximum of one prefix from each of the five groups.” …. “The result of using multiple prefixes from a single group is unpredictable.”. This pretty much sums up all the problems in the world related to prefixes. I guess you can see for yourself that from these 2 lines you can actually treat them in many different ways. We know now that it can lead to “unpredictable” results if you have many prefixes - in reality it won’t shut down your CPU, it won’t even throw an exception. So screw it, you say, and you’re right. Now let’s see some CPU (16 bits) logic for decoding the prefixes:

while (prefix byte is read) {
  switch (prefix) {
    case seg_cs: use_seg = cs; break;
    case seg_ds: use_seg = ds; break;
    case seg_ss: use_seg = ss; break;
    ….
    ….
    case op_size: op_size = 32; break;
    case op_addr: op_addr = 32; break;
    case rep_z: rep = z; break;
    …
  }
  - skip byte in stream -
}

The processor will use those flags in order to know which prefixes were present. The thing about using a loop (in any form) is that when you have to produce text out of streams with many prefixes, you don’t know whether the processor really uses the first occurrence of a prefix or its last, or maybe both. And maybe Intel and AMD implement it differently?

You know what? Why the heck do I bother so much with some minor edge cases that never really happen in real code sections? I ask myself that too; maybe I shouldn’t. Although I happened to see for myself some malware code that tries to screw up the disassembler with many extra prefixes, etc., and I thought diStorm could help malware analysts as well with advanced prefix decoding.
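Here is how that loop could look as real C, as a rough sketch and nothing more. This is not diStorm code and not a statement about what any CPU does: the struct, the names and the “last one seen wins” behavior are my assumptions, taken straight from the pseudo logic above. The prefix byte values are the standard x86 ones.

```c
#include <stdint.h>
#include <stddef.h>

enum seg { SEG_NONE, SEG_ES, SEG_CS, SEG_SS, SEG_DS, SEG_FS, SEG_GS };

struct prefix_state {
    enum seg seg;     /* last segment override seen */
    int op_size;      /* 1 if 0x66 was seen */
    int addr_size;    /* 1 if 0x67 was seen */
    int lock;         /* 1 if 0xF0 was seen */
    uint8_t rep;      /* 0xF2 or 0xF3, whichever came last; 0 if none */
    size_t length;    /* number of prefix bytes consumed */
};

/* Walks the prefix bytes at the start of 'code'. Each assignment simply
   overwrites the previous one, so the last prefix of a group wins. */
static struct prefix_state scan_prefixes(const uint8_t *code, size_t len)
{
    struct prefix_state st = { SEG_NONE, 0, 0, 0, 0, 0 };
    size_t i;

    for (i = 0; i < len; i++) {
        switch (code[i]) {
        case 0x26: st.seg = SEG_ES; break;
        case 0x2E: st.seg = SEG_CS; break;
        case 0x36: st.seg = SEG_SS; break;
        case 0x3E: st.seg = SEG_DS; break;
        case 0x64: st.seg = SEG_FS; break;
        case 0x65: st.seg = SEG_GS; break;
        case 0x66: st.op_size = 1; break;
        case 0x67: st.addr_size = 1; break;
        case 0xF0: st.lock = 1; break;
        case 0xF2:
        case 0xF3: st.rep = code[i]; break;
        default:   /* first non-prefix byte: the opcode starts here */
            st.length = i;
            return st;
        }
    }
    st.length = len;
    return st;
}
```

Notice that once the loop is done, the flags say nothing about where in the stream each prefix came from, and that is exactly the information a disassembler needs when it has to print the thing.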

Anyways, according to the above logic code I’m supposed to use the last prefix of each type. Given a stream such as: 66 66 67 67 40. I will get:
0: 66 (dropped)
2: 67 (dropped)
1: 66 67 40
Now you can see that the prefixes used are the second and the fourth, and that the instruction starts at the second byte of the stream. Now I can officially commit suicide; even I can’t follow these addresses, it’s hell. So, any better solution?
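One way to at least make the bookkeeping explicit is to mark, per prefix byte, whether it is used or dropped. The following is only a sketch of that idea, not what diStorm does: the grouping into four buckets is a simplification, the names are made up, and “keep the last prefix of each group” is the assumption taken from the logic above.

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified prefix groups for this sketch; -1 means "not a prefix". */
static int prefix_group(uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:
        return 0;                       /* lock / rep */
    case 0x26: case 0x2E: case 0x36:
    case 0x3E: case 0x64: case 0x65:
        return 1;                       /* segment override */
    case 0x66:
        return 2;                       /* operand size */
    case 0x67:
        return 3;                       /* address size */
    default:
        return -1;
    }
}

/* Sets used[i] to 1 for the last prefix of each group and to 0 for the
   dropped ones. 'used' must have room for 'len' entries. Returns the
   number of prefix bytes found at the start of the stream. */
static size_t mark_used_prefixes(const uint8_t *code, size_t len, int *used)
{
    ptrdiff_t last[4] = { -1, -1, -1, -1 };
    size_t n = 0;

    while (n < len && prefix_group(code[n]) >= 0) {
        int g = prefix_group(code[n]);
        if (last[g] >= 0)
            used[last[g]] = 0;          /* an earlier one in this group is dropped */
        used[n] = 1;
        last[g] = (ptrdiff_t)n;
        n++;
    }
    return n;
}
```

For the stream 66 66 67 67 40 this marks the bytes at offsets 1 and 3 as used and those at offsets 0 and 2 as dropped, which matches the listing above. It doesn’t solve the question of how to print them nicely, though.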

Welcome Back

December 20th, 2008

Hey you guys again, I’m back from South East Asia after 3 months of traveling all around. Was awesome :)

So here’s a potentially cool true story: while I was walking with a few friends on the beach in Vietnam (Nha Trang, to be accurate), a friend found a pouch with credit cards, a driving license, etc. The only things we knew about that pouch were the owner’s name and that she was Irish. That didn’t really help us get to her; unfortunately no cellphone number was attached anywhere in the pouch. The next thing we thought of was to look her up on FaceBook, but she wasn’t listed (who doesn’t have FB nowadays? :) ). So we had to hand it over to the local Vietnamese police station, but that poor girl probably continued traveling and didn’t find it…

Anyways, I just realized something very nice. Suppose you have somebody’s email, whether someone left a comment on this blog with only an email, or whatever. And you wish to find out who he/she is. Usually we fire up Google and look for that email, and we can learn much from that. But sometimes we can’t find anything. And besides, even if we do find something, it might not be relevant, or not enough information about that person. What I realized was that you can search for people by their email on FaceBook, and I really managed to find a few people who were anonymous except for their emails, which is quite interesting… Finally we’ve got some way to link a person with an email address, think about it.

So that’s it, I’m back for a couple of months, hopefully I will write some interesting posts, need to get ideas, which usually are originated from my work, stay tuned ;)

Software That Uses diStorm

August 24th, 2008

After a few years of diStorm being out, we can already see it used here and there. Most users are private users rather than commercial, but even commercial applications use diStorm. I guess many people also use it internally in their companies, but without their word I can’t really know about it, except for some friends who tell me so.

It’s pretty cool to write something useful which people actually use, including commercial use. That was my main reason for releasing diStorm under the permissive BSD license. The problem arises when some commercial applications don’t give credit for your work; it’s really frustrating, and I guess one can’t do much about it. There is this Vietnamese BKV Pro anti-virus software that is claimed to be written by professors and students (or the like), so I really didn’t expect to get no credit from such people. But this is our world :( I got an email from an advocate about the infringement of diStorm’s copyright. It seems they also abuse WinRAR’s license, so I’m not the only one... To be honest, I prefer that they stop using diStorm immediately rather than not give me my credit. There are other disassembler libraries out there; they could use them as well. On the other hand, I’m happy to know they use diStorm, and I only ask for recognition, nothing else, after all the hard work I put in. I emailed them but got no response. This license violation by the AV guys seems to be making a lot of noise in Vietnamese blogs and forums, though I can’t really understand anything, except where they quote diStorm’s license or mention my name. I haven’t contacted OSI yet, and I’m not sure if they can really help, but it’s worth a try.

Anyway, there are good people who do give credit, and I decided it’s about time I showed a small list of users. The first one, though, goes to a good friend I met through diStorm, who reported many bugs and helped in testing the support for the 64-bit environment (not to be confused with AMD64): Sanjay Patel. He works at (and founded) RotateRight.com, which last month released their Zoom product, a very smart Profiler, currently only for Linux though. The product is free as a 30-day trial version, and you should check it out. It seems to be very promising, because I know more of the guys behind this product, although I haven’t tested it myself. But hey, it uses diStorm :)

More products which use diStorm:

Apple Shark Profiler

SolidShield - server side protector

DFSee - Low Level disk tools

And some open source projects:

Python-ptrace

Crypto Implementations Analysis Toolkit

Well, that’s what I’m aware of at least; I believe there are more, though.

Have fun :)