Lib hell (or DLL hell)

Tuesday, August 21st, 2007

Over the last few months I have received many emails about compiling diStorm, usually for different compilers and platforms. So I have spent the last week making diStorm compilable with all of those compilers. Some of them target DOS (OpenWatcom/DJGPP/DMC), some target Linux and its various distributions (mostly GCC), and the rest target Windows (MSVS, MinGW, TCC, etc.).

The problem begins with the one and only function of the diStorm library, distorm_decode. This function takes a lot of parameters; the first is the offset of the code as if it were loaded in memory (AKA the virtual address). For example, you would feed it 0x100 for .COM binary files (that is, DOS). The type of this parameter varies with your compiler and environment: it might be a 32-bit integer or, preferably, a 64-bit integer (unsigned long long/unsigned __int64). If your compiler can generate 64-bit integer code, you would prefer it over 32-bit code, because when you disassemble 64-bit code you can then set any 64-bit offset you like. Otherwise, you are limited to 32-bit offsets.

So diStorm has a special config.h file in which you configure the relevant macros so that your compiler can compile diStorm. The config.h file is where all the portability macros live. It is also where you choose whether to use 64-bit offsets.

Now the distorm_decode function uses that integer type, which depends on your decision (which in turn will probably follow from whether your compiler supports 64-bit integers). To simplify matters, think of the diStorm interface as a function with the following declaration:

#ifdef SUPPORT64
void distorm_decode(unsigned long long offset);
#else
void distorm_decode(unsigned long offset);
#endif
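One way to keep the declaration from being written twice is to let the macro pick a typedef and declare the function once in terms of it. This is only a minimal sketch: the name offset_type and the helper offset_bits are made up for illustration, and the #define of SUPPORT64 stands in for whatever config.h would really set.

```c
#include <limits.h>

/* Hypothetical stand-in for the switch that would live in config.h. */
#define SUPPORT64

#ifdef SUPPORT64
typedef unsigned long long offset_type;  /* at least 64 bits wide */
#else
typedef unsigned long offset_type;       /* at least 32 bits wide */
#endif

/* Simplified interface, written once in terms of the typedef. */
void distorm_decode(offset_type offset);

/* Illustration only: width in bits of the offset type picked above. */
unsigned offset_bits(void)
{
    return (unsigned)(sizeof(offset_type) * CHAR_BIT);
}
```

The typedef tidies up the header, but it does nothing for the mismatch problem: the library and the caller still have to agree on SUPPORT64.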

Now, this function is exported by the library so that a caller project can use it, and as you can see, its declaration may change. What I have done until now is to have this macro (SUPPORT64) defined twice: once for the library itself when it is compiled, and once for the caller project, so that when it uses the library it knows which integer size to use. So far I haven’t had any problems with it. But since many compilers (say, for DOS) don’t support 64-bit integers, it is a pain in the ass to change these two different files. I couldn’t come up with a single shared header file for the sake of the caller project, since the caller might get only a compiled version of the library, and then what? Trial and error until it finds the correct integer size? I can’t allow things to be so sloppy.

Things could get really nasty if the library was compiled with 32-bit integers and the caller project then uses it with 64-bit integers. The stack is unbalanced, and the program will probably crash somewhere when it tries to use some pointer… sooner or later you will notice it. And then you will have to know to comment out the macro in the caller project.
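To see why the mismatch bites, here is a small simulation of it. It is a sketch under loud assumptions: the byte buffer only imitates a 32-bit stack where arguments sit back to back, real calling conventions differ, and the second argument (SECOND_ARG) is an invented value standing in for whatever parameter follows the offset. The caller lays out a 64-bit offset, the callee reads as if the offset were 32 bits wide.

```c
#include <stdint.h>
#include <string.h>

#define SECOND_ARG 0xCAFEBABEu  /* invented value for the next parameter */

/* Caller side: built with 64-bit offsets, so it writes 8 + 4 bytes. */
static size_t caller_push(unsigned char *stack, uint64_t offset, uint32_t second)
{
    memcpy(stack, &offset, sizeof offset);
    memcpy(stack + sizeof offset, &second, sizeof second);
    return sizeof offset + sizeof second;
}

/* Callee side: built with 32-bit offsets, so it expects 4 + 4 bytes and
   looks for the second argument right after a 4-byte offset. */
static uint32_t callee_read_second(const unsigned char *stack)
{
    uint32_t second;
    memcpy(&second, stack + sizeof(uint32_t), sizeof second);
    return second;
}

/* Returns what the callee believes the second argument is. */
uint32_t simulate_mismatch(void)
{
    unsigned char stack[16] = {0};
    caller_push(stack, 0x100, SECOND_ARG);
    return callee_read_second(stack);
}
```

The callee ends up reading the other half of the 64-bit offset instead of SECOND_ARG. If that parameter were a pointer, this is the garbage the library would go on to dereference.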

After I got this complaint from one of the diStorm users, I decided to try to put an end to this lib hell. I tried to think of different approaches. The best would be to have reflection in C, or a scripting language in which I could determine the size of the integer at runtime and know how to invoke distorm_decode. Unfortunately, this is not the case. I started thinking of COM, or SxS, and other not really helpful ways of solving DLL hell, but I sense that is the wrong way. It is a library after all, and I don’t have the luxury of running code. I mean, the problem could really be solved by running code that asks the library (or rather the DLL, actually) which interface to use… which reminds me somewhat of QueryInterface. But alas, I can’t do that either, since it is compile time we are talking about. So all I’ve got is this lousy idea:

The exported function name will be determined according to the size of the integer it was compiled with. It would look like this:

#ifdef SUPPORT64
void distorm_decode64(unsigned long long offset);
#define distorm_decode distorm_decode64
#else … you get the idea
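Spelled out with both branches, the idea might look like the sketch below. Treat it as an illustration rather than the final diStorm header: the 32-bit name distorm_decode32, the typedef offset_type, the trivial function body, and the two helper functions are all assumptions made for the example, and SUPPORT64 is defined inline only to keep the snippet self-contained.

```c
/* Hypothetical: in practice this would be set per build, not hard-coded. */
#define SUPPORT64

#ifdef SUPPORT64
typedef unsigned long long offset_type;
#define distorm_decode distorm_decode64
#else
typedef unsigned long offset_type;
#define distorm_decode distorm_decode32   /* 32-bit name is assumed */
#endif

/* Stand-in state so the sketch has something observable. */
static offset_type last_offset;

/* Library side: the macro renames the definition too, so the symbol that
   ends up exported is distorm_decode64 or distorm_decode32. */
void distorm_decode(offset_type offset)
{
    last_offset = offset;
}

offset_type last_decoded_offset(void)
{
    return last_offset;
}

/* Illustration only: shows which symbol name the macro expanded to. */
#define STR_(x) #x
#define STR(x) STR_(x)
const char *exported_name(void)
{
    return STR(distorm_decode);
}
```

Caller code keeps writing distorm_decode(0x100) as before. If the caller's SUPPORT64 setting disagrees with the library's, the call refers to a symbol the library never exported, and the linker complains.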

But we are still stuck with the SUPPORT64 macro in the caller project. I think there is no way around it, and we will have to stick with it to the bitter end…

Then what did we gain from this change? Quite a lot, actually. Let’s see:

1) The actual code of existing projects doesn’t need to change at all; the macro substitution does the work automatically.

2) The application won’t crash, since getting the integer size wrong means the exported symbol won’t be found.

3) It is now a link-time error rather than a runtime crash/error. So before the caller project is fully linked, you will know to change the SUPPORT64 macro (whether to add it or drop it).

4) The code as it was before could lead to a crash, since you could manually enable/disable the macro on one side only and nothing would detect the mismatch.

Well, I still feel it’s not the best idea out there, but hey, I really don’t want to write code in the caller project that determines the interface at runtime and only then knows which function to use… it’s simply ugly and not right, because you can do it earlier and cleaner.
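For contrast, the runtime approach being turned down here would look roughly like this. It is purely a sketch: distorm_offset_size is a hypothetical helper, not something diStorm exports, and lib_offset_type just pretends the library was built with 64-bit offsets.

```c
#include <stddef.h>

/* Pretend the library was compiled with 64-bit offsets (assumption). */
typedef unsigned long long lib_offset_type;

/* Library side: hypothetical exported helper that reports the size, in
   bytes, of the offset type the library was built with. */
size_t distorm_offset_size(void)
{
    return sizeof(lib_offset_type);
}

/* Caller side: check at runtime, before the first real call, that the
   caller's idea of the offset size matches the library's. */
int interface_matches(size_t caller_offset_size)
{
    return distorm_offset_size() == caller_offset_size;
}
```

It works, but the check only fires once the program is already running, and every caller has to remember to do it. The renamed symbol catches the same mistake at link time with no extra code.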

I would like to hear how you would do it. I am still not committing my code… so hurry up :)