Common PE-Parsing Pitfalls

PE, or Portable Executable, is Windows’ executable format. Looking only at the PE structure, as opposed to the code inside, can teach you a lot about the application. But sometimes the tools you use to parse the file don’t do their work well. Here I want to show a few problems with this kind of tool. Note that .DLL and .SYS files are also PE files under Windows, so I treat them all the same when talking about PE files.

  1. If the Export-Directory offset is a garbage pointer, Windows will still load the PE and run it successfully. Things go crazy, and it will probably crash, only when you call GetProcAddress on this file. You can use this to render some tools useless while the file stays valid and runnable. Don’t confuse this with the Import-Directory, which is necessary for loading the PE.
  2. Another annoying thing, whether the so-called “PE standard” states it or not, is the way RVA (relative virtual address) offsets are handled. An RVA is the distance from the beginning of the file in memory to some specific field or section. Most people treat RVAs as if they must point into a section. But in reality an RVA can point anywhere in the memory-mapped file. They can even be negative numbers (at least as long as they still point to valid memory of the file). The problem is that most tools try to find the pointee field or section by scanning the section list, but alas, a valid RVA might point to no section at all and land in another part of the memory-mapped file, for example the MZ header. While Windows loads these files fine, some tools cannot parse them.
  3. The most interesting problem, one I found myself (not sure if anyone used it before), is changing a section’s File-Offset so that it is misaligned. The loader actually rounds the File-Offset down to a multiple of the sector size (that’s 512 bytes) no matter what. So adding some number to the original, valid File-Offset of the code section will fool some tools into reading the code from that offset instead of the rounded one. Guess what happens? You disassemble the code from the wrong address and everything gets messed up. Even the mighty IDA had this bug. I introduced this technique in my Tiny PE Challenge. It seems most anti-virus software couldn’t chew this file back when I released it. Not sure about nowadays.
  4. While I was researching for Tiny PE, Matthew Murphy pointed out that you can load files from the Internet by feeding the loader a raw URL of the PE file. Later on this was extended so that the Windows loader uses WebDAV to load an imported .DLL from the Internet! Think of an imported library with the following name inside the PE file itself: \\127.0.0.1\my.dll. This one seemed to be a real blow to the AV industry. It means you can write an otherwise empty PE file that has this special import and pulls its code off the Internet. For samples you can check it out here, which covers Tiny PE (not my ones) very nicely.
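
To make points 2 and 3 concrete, here is a minimal Python sketch of an RVA-to-file-offset lookup that tries to be as forgiving as the loader is described to be above. It is not taken from diSlib64 or any other tool. The function names and the tuple layout used for sections are made up for this example, and the 512-byte rounding is simply the behavior described in point 3, not something verified against every Windows version.

```python
SECTOR_SIZE = 512  # rounding granularity described in point 3 (assumption)


def round_raw_pointer(pointer_to_raw_data, sector_size=SECTOR_SIZE):
    """Round a section's file offset down to a sector boundary."""
    return pointer_to_raw_data - (pointer_to_raw_data % sector_size)


def rva_to_file_offset(rva, sections, size_of_headers):
    """Map an RVA to a file offset, or return None if no file bytes back it.

    sections: iterable of (virtual_address, virtual_size,
              pointer_to_raw_data, size_of_raw_data) tuples
              (a hypothetical layout chosen for this sketch).
    """
    for virtual_address, virtual_size, raw_pointer, raw_size in sections:
        # Some files leave VirtualSize at 0; fall back to the raw size.
        span = virtual_size or raw_size
        if virtual_address <= rva < virtual_address + span:
            delta = rva - virtual_address
            if delta >= raw_size:
                return None  # zero-filled tail of the section, not in the file
            # Point 3: use the rounded-down offset, not the stored one.
            return round_raw_pointer(raw_pointer) + delta
    # Point 2: no section matched, but the headers are mapped as well,
    # and inside them the file offset equals the RVA.
    if 0 <= rva < size_of_headers:
        return rva
    return None


if __name__ == "__main__":
    # A code section whose stored file offset (0x410) is deliberately misaligned.
    code = (0x1000, 0x200, 0x410, 0x200)
    print(hex(rva_to_file_offset(0x1004, [code], 0x400)))  # 0x404, not 0x414
    print(hex(rva_to_file_offset(0x40, [code], 0x400)))    # 0x40, inside the headers
    print(rva_to_file_offset(0x5000, [code], 0x400))       # None
```

A tool that uses the stored 0x410 directly would start reading 0x10 bytes too late, and a tool that only scans the section list would reject the header RVA outright.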

The bottom line is that the Windows loader is kinda robust and yet very permissive. It seems that viruses can exploit many features the PE format has to offer while AVs still don’t support some of them. I guess some of the tools (some of them really popular) will get better with time. For now, my PE parser library for Python, diSlib64, seems to do the job quite well.
