Python « Insanely Low-Level

Archive for the ‘Python’ Category

Executing .PYC Files in Python

Friday, April 29th, 2011

I got this .PYC (compiled Python script) file that I used with command-line Python, à la “python.exe script.pyc”, which would run script.pyc and let it do its job. The problem was that I wanted to run a few Python lines before running script.pyc itself. But apparently, it’s not that simple.

Suppose script.pyc does the familiar name check:
if __name__ == "__main__":
    main()

I guess everyone who wrote a module or two in Python knows this trick. This way you can tell in your script whether it was imported or executed, and act accordingly. If it was executed, you will probably want to run a test case for the module, usually by calling main(), and you could even pass command line arguments to it the normal way.

Since I wanted to do a few things before the script gets to run, I had to start Python myself, do my stuff, and then execute the script. Unfortunately, it is not possible to execfile() a .PYC file (beats me why): execfile() runs only pure Python source code, not a compiled script. And I couldn’t just import the file either, since then the if statement I presented above would fail, main() wouldn’t be called, and nothing would happen.

What I eventually came up with was to import the file myself, but that required a bit of trickery :)
You can call __import__ in your code at runtime. What I mean is that when you don’t know the name of the module to import ahead of time, you can use that function, which receives a module name and dynamically loads the module at runtime.
I tried that on script.pyc and obviously it didn’t work either, because __name__ was wrong, so main() didn’t get executed. That made me realize that I needed to do __import__’s internals on my own, and only then would I be able to change the module’s __name__ (if that’s possible, but it has to be, since the distinction exists, right?)

Then after a bit of googling around I stumbled upon the imp module, whose documentation shows how to import a file ourselves. I changed its example to:
import imp
fp, pathname, description = imp.find_module("script")
imp.load_module("__main__", fp, pathname, description)

Notice how I pass __main__ instead of the module’s real name. That way the __name__ check inside script.pyc really works and executes main(), and I’m all happy once again.
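For what it’s worth, here is a minimal sketch of the same idea for Python 3, where imp is deprecated (and removed as of Python 3.12). It relies on runpy.run_path, which accepts a compiled bytecode file and lets you choose the run name. The helper name run_pyc_as_main and the throwaway script it builds are made up for the illustration.

```python
import os
import py_compile
import runpy
import tempfile


def run_pyc_as_main(pyc_path):
    """Run a compiled script so that its __name__ == "__main__" check passes."""
    return runpy.run_path(pyc_path, run_name="__main__")


# Build a throwaway script.pyc to demonstrate (contents are invented).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "script.py")
pyc = os.path.join(workdir, "script.pyc")
with open(src, "w") as f:
    f.write(
        "def main():\n"
        "    return 'main ran'\n"
        "\n"
        "result = None\n"
        "if __name__ == '__main__':\n"
        "    result = main()\n"
    )
py_compile.compile(src, cfile=pyc, doraise=True)
os.remove(src)  # only the compiled file is left, as in the post

# The "few Python lines" that must run first would go here.
namespace = run_pyc_as_main(pyc)
print(namespace["result"])
```

run_path returns the globals of the executed script, so anything main() left behind can be inspected afterwards.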

Tags: __name__, compiled python, execfile, import, pyc, python, script
Posted in Python

diStorm for Java, JNI

Monday, October 4th, 2010

Since we decided to use Java for the reverse engineering studio, ReviveR, I had to wrap diStorm for Java. For now we decided that the core framework is going to be written in Java, and probably the UI too, although we haven’t concluded that yet. Anyway, now we are thinking about the design of the whole system, and I’m so excited about how things start to look. I will save a whole post to tell you about the design once it’s ready.

I wanted to talk a bit about JNI, that’s the Java Native Interface. Since diStorm is written in C, I had to use JNI to use it inside Java. It might remind people of P/Invoke, or Python extensions, etc.

The first thing I had to do was to define the same C structures of diStorm’s API, but in Java. This time they are classes, encapsulated obviously. Once I had these classes ready (and, stupid Java, I had to put each public class in a separate file) I ended up with something like 10 files for all the definitions. The next step was to compile the whole thing and use the javah tool to get the definitions for the native functions. I didn’t like the way it worked: for instance, any time you rename a package, or add/remove a package, the name of the C function exported from the native .DLL file changes as well. Big loss.
Once I finally decided on the names of the packages and classes, I could move on to implementing the native C functions that correspond to the native definitions in the Java class I wrote earlier. If you’re a bit familiar with JNI, you probably know jobject and its friends well. And because I use classes rather than a few primitive-type arguments, I had to work hard to wrap them, not to mention the arrays of instructions I want to return to the caller.

The Java definition looks as such:

public static native void Decompose(CodeInfo ci, DecomposedResult dr);

The corresponding C function looks as such:

JNIEXPORT void JNICALL Java_distorm3_Distorm3_Decompose
  (JNIEnv *env, jobject thiz, jobject jciObj, jobject jdrObj);

Since the method is static, there’s no use for the thiz argument (the equivalent of a class’s this); for a static native method JNI actually passes the class there, a jclass, rather than an instance. Then come the two objects, for input and output.
Now, the way we treat the jobjects depends on the classes we defined in Java. I separated them in such a way that one class, CodeInfo, is used for the input of the disassembler, and the other class, DecomposedResult, is used for output; this one contains an array to return the instructions that were disassembled.

Since we are now messing with Java arrays, we don’t need another out-argument to indicate the number of entries we returned in the array, right? Because now we can use something like array.length, as opposed to a C function such as void f(int a[], int n). So I found myself having to change the classes a bit to take this into account, nothing special though. We just need to take advantage of what a high-level language gives us.

Moving on, we have to access the fields of the classes, and this is where I got really irritated by the way JNI works. I wish it were as easy as ctypes for Python; of course they are not exactly parallel, but they solve the same problem after all. Or a different approach, like parsing a tuple in embedded Python with PyArg_ParseTuple, which eases this process so much.

For each field in the class, you need to know both its type and its Id. The type is something you know at compile time; it’s predefined and simply depends on the way you defined your classes in Java, easy. The ugly part begins now, with the Ids: you have to say which field you want to access, either for reading or for writing. The idea behind those Ids was to make the code more flexible, in the sense that if you inherit from a class, the field you want to access has probably moved to a new slot in the raw structure that contains it.
Think of something like this:

struct A {
    int x, y;
};

struct B {
    int z, color;
};

struct AB {
    struct A a;
    struct B b;
};

Suddenly, accessing z through an AB (ab.b.z) uses a different offset than accessing z in a plain B (b.z). Can you see that?
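To see the shifting offsets concretely, here is a minimal sketch in Python using ctypes. It mirrors the three structs above; the member names a and b are made up for the illustration, and the actual byte values depend on the platform’s size of int.

```python
import ctypes


class A(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


class B(ctypes.Structure):
    _fields_ = [("z", ctypes.c_int), ("color", ctypes.c_int)]


class AB(ctypes.Structure):
    # Member names "a" and "b" are hypothetical, chosen for this example.
    _fields_ = [("a", A), ("b", B)]


# Offset of z inside a standalone B.
z_in_b = B.z.offset
# Offset of z when B is embedded in AB: B's own offset plus z's offset in B.
z_in_ab = AB.b.offset + B.z.offset

print("z in B: ", z_in_b)
print("z in AB:", z_in_ab)
```

The same field lands in a different slot once its struct is embedded in a bigger one, which is the situation the field Ids are meant to hide.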
So the guys who designed JNI came up with the idea of querying the class, using internal reflection, to get this Id (or really an index of the variable in the struct, I take a guess). But this reflection thingy is really slow; obviously you need to do string comparisons on all members of the class, and of all the classes it derives from… No good. So you might say, “but wait a sec, the class’s hierarchy is not going to change in the lifetime of the application, so why not reuse the value?”. This is where the JNI documentation talks about caching Ids. Now seriously, why don’t you guys do it for us internally? Why do I need to implement my own caching? Don’t give me this ‘more-control’ bullshit. I don’t want control, I want to access the field as quickly as possible and get on to other tasks.

Well, since the facts are different, and we have to do things the way we do, now we have to cache the stupid Ids for good. While I read how people did it and why they got mysterious crashes, I solved the problem quickly, but I want to elaborate on it.

In order to cache the Ids of the fields you want to have access to, you do the following:

if (g_ID_CodeOffset == NULL) {
    g_ID_CodeOffset = (*env)->GetFieldID(env, jCls, "mCodeOffset", "J");
    // Assume the field exists, otherwise your interfaces are broken anyway.
}
// Now we can use it...

Great, right? Well, not so fast. The problem is that if you have a few functions that each access this same class and its members, you will need these few lines of code everywhere, for each use. No go. Therefore the common solution is to have another native static InitIDs function and invoke it right after loading the native library in your Java code, for instance:

static {
	System.loadLibrary("distorm3");
	InitIDs();
}

Another option would be to use the JNI_OnLoad exported function to initialize all the global Ids before any of the other functions ever gets called. I like that option more than InitIDs; it is less artificial in my opinion.

Once we got the Id ready we can use it, for instance:

codeOffset = (*env)->GetLongField(env, jciObj, g_ID_CodeOffset);

Note that I use the C interface of the JNI API, just so you are aware of it. And jciObj is the argument we got from Java calling us in the Decompose function.

When calling the GetFieldID function we have to pass a jclass, which is kind of a pointer to the Java class object, as opposed to an instance of the class; I hope you know the difference. Now, since we cache the Ids for the rest of the application’s lifetime, we have to keep a reference to this Java class, otherwise weird problems and nice crashes should (un)surprise you. This is crucial since we use the same Ids for the same classes all along the code. So when we call GetFieldID we should hold a reference to that class, by calling:

(*env)->NewWeakGlobalRef(env, jCls);

Note that jCls was retrieved using:

jCls = (*env)->FindClass(env, "distorm3/CodeInfo");

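Of course, don’t forget to release the references to those classes when you are done, in JNI_OnUnload, to avoid leaks. Use the delete function that matches how the reference was created: DeleteWeakGlobalRef for a reference made with NewWeakGlobalRef, DeleteGlobalRef for one made with NewGlobalRef. Keep in mind that a weak global reference does not by itself keep the class from being unloaded; only a (strong) global reference does.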

The FindClass function is very good once you know how to use it. It took me a while to figure out the syntax and naming convention. For example, String, which seems to be a primitive type in Java, is really not; it’s just a normal class, therefore you will have to use “java/lang/String” if you want to access a string member.
Suppose you have a class “CodeInfo” in the “distorm3” package, then “distorm3/CodeInfo” is the package-name/class-name.
Suppose you have an inner class (inside another class), then “distorm3/Outer$Inner” is the package-name/outer-class-name$inner-class-name.
There is probably a bit more to it, but that’s a good start.

About returning new objects to the caller. We said already that we don’t use out-arguments in Java.
Think of:

void f(int *n)
{
 *n = 5;
}

That’s exactly what an out-argument is, to return some value rather than using the return keyword…
When you want to return lots of info, it’s not a good idea, you will have to pass lots of arguments as well, pretty ugly.
The idea is to pass a structure/class that will hold this information, and even have some encapsulation to it.
The problem at hand is whether to use a constructor of the class, or just create the object and set each of its values manually.
Also, I wonder which method is faster, letting the JVM do it on its own in a constructor, or doing everything using JNI.
Unfortunately I don’t have an answer to this question. I can only say that I used the latter method of creating the raw object and setting its fields. I thought it would be better.
It looks like this:

jobject jOperand = (*env)->AllocObject(env, g_OperandIds.jCls);
if (jOperand == NULL) { /* Handle error! */ }
(*env)->SetIntField(env, jOperand, g_OperandIds.ID_Type, insts[i].ops[j].type);
(*env)->SetIntField(env, jOperand, g_OperandIds.ID_Index, insts[i].ops[j].index);
(*env)->SetIntField(env, jOperand, g_OperandIds.ID_Size, insts[i].ops[j].size);

(*env)->SetObjectArrayElement(env, jOperands, j, jOperand);

This is a real piece of code taken from the wrapper. It constructs an Operand object from the Operand structure in C. Notice the way AllocObject is used, with that jCls we hold a reference to, instead of calling FindClass again… Then we set the fields and store the object in the array of Operands.

What I didn’t like much in JNI is that I had to call SetField, GetField and all their typed variations. On one hand, I understand they wanted you to know which type of field you access. But on the other hand, when I queried the Id of the field I already specified its type, so I pretty much know what type of value I’m setting, so… Well, unless you have bugs in your code, but that will always cause problems.

On to another issue: one of the members of the CodeInfo structure in diStorm is a pointer to the binary code that you want to disassemble. It means that we have to get some buffer from Java as well. But apparently the JVM sometimes decides to make a copy of the buffer/array that is being passed to the native function. In the beginning I used a straightforward byte[] member in the class. This sucks hard. We don’t want to waste time on copying and freeing buffers that are read-only. Performance, if it can be better, should be better by default, if you ask me. So, reading the documentation, I found there’s an extension to JNI that lets you use java.nio.ByteBuffer, which gives you direct access to the Java buffer without the extra effort of copying. Note that it requires the caller of the native API to use this class specifically, and sometimes you’re limited…

The bottom line is that it takes a short while to understand how to use JNI and then you get going with it. I found it a bit cumbersome… The most annoying part is all the extra preparation you have to do in order to access a class or a field, etc. Unless you don’t care at all about performance, but then your code is less readable for sure. We don’t have any information about the performance of allocating new objects or of array usage, so we can’t base our ways of coding on anything. I wish it were more user friendly, or that parts of it could be eliminated somehow.

Tags: diStorm, Java, JNI, ReviveR
Posted in diStorm, Optimization, Python, ReviveR

diStorm3 is Ready

Monday, August 16th, 2010

diStorm3 is ready for the masses! :)
Use it if you want to maximize the information you get from a single instruction: structured output rather than text, flow control analysis support and more!

Check it out now at its new google page.

Good luck!

Tags: disassembler library, Disassembling, disassembly, diStorm, distorm3, distorm64
Posted in Assembly, Code Analysis, diStorm, Python, Reversing, Security, Software

Context of Execution (Meet Pyda)

Tuesday, August 5th, 2008

While I was writing an interactive console Python plugin for IDA Pro I stumbled upon a very nasty problem. The way a plugin works in IDA is that you supply 3 callbacks: init, run and term. You can guess which is called when. I might just add that there are a few types of plugins and each type can be loaded at a different time. I used, let’s say, a UI plugin which gets to run when you hit a hotkey. At that moment my run() function executes in the context of IDA’s thread, and only then am I allowed to use all the other data/code from IDA’s SDK. This is all good so far, and as it should be for such a plugin interface. So what’s wrong? The key idea behind my plugin is that it is an interactive Python console. This means you have a console window where you enter commands which eventually get executed by the Python interpreter. Indirectly, this means that I create a thread with a tight loop that waits for the user’s input (it doesn’t really matter how that is done at the moment). Once the user supplies some input and presses the enter key, I get the buffer and run it. Now, it should be clear that the code which Python runs is in that same console thread, which I created. Can you see the problem? Ok, maybe not yet.

Some of the commands you can run are wrappers for IDA commands; just like the somewhat embedded IDC scripting in IDA, you have all those functions, but this time in Python. Suppose that you try to access an instruction to get its mnemonic from Python. This time you do it from a different thread, while the unspoken contract with an IDA plugin is that whatever you do, you do it during the run() function. Unfortunately this is not the case here, and cannot ever be the case, since it is an interactive console that waits for your input and should be ready anytime. If you just try to run SDK commands from the other (console) thread, then you experience mysterious crashes and unexplained weird bugs, which is a big no-no, and unforgivable for a product like this, of course.

Now that we know the problem we need to hack a solution. I could have done a very nasty hack: each time the user runs something in Python, I simulate the hotkey for the plugin, then my plugin’s run() gets called from IDA’s context, and there I look at some global info which instructs me what to run, and later on I need to marshal the result back to Python’s (console) thread somehow. This is all possible, but c’mon, this is fugly.

I read the SDK upside down to find some way to get another callback, which I could somehow trigger whenever I want, in order to run from IDA’s context. I hardly found anything good, and in the end nothing worked properly and it was too shaky as well.

Just to note that if you call a function from the created thread, most of the time it works, but when you have some algorithm running heavily, you get crashes after a short while. Even something simple like reading some info about an instruction in a busy loop while scrolling the view at the same time might cause problems.

Then I decided it was time to do it correctly, without any hacks at all. So I started reading MSDN, looking for ways to run my own code in another thread’s context. Alternatively, maybe I could block IDA’s thread while running the functions from my own thread, but that doesn’t sound too clever, and there’s no really easy way to block a thread. And mind you that I wanted as simple and clean a solution as possible, without crafting some Assembly code, etc. ;)

So one of the things that jumped into my mind was APC, using the function QueueUserAPC, which gets among other params a thread HANDLE, which is exactly what I needed. However, the APC will get to run on the target thread only when it is in an alertable state. Which is really bad, because most UI threads are never in that state. So off I went to find another idea, though this time I felt I was getting closer to the solution I wanted.

Later on a friend tipped me off to check out window messages. Apparently if you send a message to a window which was created by the target thread (if the thread is a UI one and has windows; of course we assume that in IDA’s case and it’s totally ok), the target thread will run that window’s window-procedure in its own context. But this is nothing new for me or you here, I guess. The thing was that I could create a window as a child of IDA’s main window and supply my own window-procedure for that window. And whenever a message is received at this window, its window-procedure gets to run in the context of IDA’s thread (and not in the context of the thread that sent it)! BAM, mission accomplished. [Oh and by the way, timers (SetTimer) work the same way, since they are implemented as messages after all.]
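The mechanism can be imitated in plain Python, as a rough analogy only. In this sketch a queue stands in for the window’s message queue and a pump loop stands in for the window-procedure; the OwnerThread class and its send method are invented for the illustration and have nothing to do with the actual Win32 or IDA APIs.

```python
import queue
import threading


class OwnerThread:
    """Stand-in for IDA's UI thread: it owns a message queue and pumps it."""

    def __init__(self):
        self._messages = queue.Queue()
        self._thread = threading.Thread(target=self._pump, daemon=True)
        self._thread.start()
        self.ident = self._thread.ident

    def _pump(self):
        # Plays the role of the message loop plus the window-procedure.
        while True:
            func, args, reply = self._messages.get()
            if func is None:
                break
            try:
                reply.put((True, func(*args)))
            except Exception as exc:
                reply.put((False, exc))

    def send(self, func, *args):
        """Like SendMessage: block the caller until the owner thread ran func."""
        reply = queue.Queue(maxsize=1)
        self._messages.put((func, args, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def stop(self):
        self._messages.put((None, (), None))
        self._thread.join()


owner = OwnerThread()
# Called from the "console" thread, but executed by the owner thread.
ran_on = owner.send(threading.get_ident)
total = owner.send(sum, [1, 2, 3])
owner.stop()
print(ran_on == owner.ident, total)
```

The caller never touches the owner’s data directly; it only posts a request and waits for the answer, which is the property that matters for the plugin.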

After implementing this mechanism for getting to run in IDA’s original context, just like the contract wants, everything runs perfectly. To be honest, it could easily not have worked, because it really depends on the way IDA works and on its design, but oh well. :)

About this tool, I am going to release it (as source code) in the upcoming month. It is called Pyda, as a cool abbreviation for Python and IDA. More information will be available when it is out.

Posted in Debugging, Python, Software, Win32

Signed Division In Python (and Vial)

Friday, April 25th, 2008

My stupid hosting company decided to move the site to a different server blah blah, the point is that I lost some of the recent DB changes and my email hasn’t been working for a week now :(

Anyways I repost it. The sad truth was that I had to find the post in Google’s cache in order to restore it, way to go.

Friday, April 18th, 2008:

As I was working on Vial to implement the IDIV instruction, I needed a signed division operator in Python. And since the x86 is 2’s complement based, I first have to convert the number from unsigned into a Python negative, and only then perform the operation, in my case a simple division. It was supposed to be a matter of a few minutes to code this function, which gets the two operands of IDIV and returns the result, but in practice it took a few bad hours.

The conversion is really easy. Say we mess with 8-bit integers, then 0xff is -1, 0x80 is -128, etc. The equation to convert a value to a Python negative is: val - (1 << sizeof(val)*8). Of course, you do that only if the most significant bit, the sign bit, is set. Eventually you return the result of val1 / val2. So far so good, but no: as I was feeding my IDIV with random input numbers, I saw that the result my Python code returned was not the same as the processor’s. This was when I started to freak out, trying to figure out what the problem was with my very simple snippet of code. And alas, later on I realized nothing was wrong with my code, it’s all Python’s fault.
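As a minimal sketch of that conversion, with the operand width passed explicitly in bits instead of sizeof (the function name to_signed and its bits parameter are my own choice for the illustration):

```python
def to_signed(val, bits):
    """Reinterpret an unsigned `bits`-wide integer as two's complement."""
    sign_bit = 1 << (bits - 1)
    if val & sign_bit:
        # Sign bit is set: subtract 2**bits to get the negative value.
        return val - (1 << bits)
    return val


print(to_signed(0xff, 8), to_signed(0x80, 8), to_signed(0x7f, 8))
```

Values without the sign bit set pass through unchanged, so the function is safe to call on every operand.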

What’s wrong with Python’s divide operator? Well, to be strict, it does not round a negative result toward 0, but toward negative infinity. Now, to be honest, I’m not really into math stuff, but all x86 processors round negative results (and positive ones too, to be accurate) toward 0. So one would really assume Python does the same, as C would, for instance. The simple case to show what I mean is 5/-3, which in Python results in -2 rather than -1, which is what the x86 IDIV instruction returns. And besides, -(5/3) is not equal to 5/-3 in Python; now is the time you say WTF. Which is another annoying point. But again, I’m not a math guy, and after speaking with many friends about this behavior I’m told that this inequality is ok in real-world math. Seriously, what do we coders care about real-world math now? I just want to simulate a simple instruction. I really wanted to go and shout “hey, there’s a bug in Python’s divide operator”, and how come nobody saw it before? But after some digging, it turns out this behavior is documented in Python. As much as I and many other people I know hate it, that’s that. I even took a look at the source code of the integer division algorithm, and saw a ‘patch’ that floors the result when it is negative, because C89 doesn’t define the rounding well enough.

While you’re coding something and you have a bug, you usually just start debugging your code, track it down and fix it easily, while keeping on working on the code, because you’re in the middle of the coding phase. Then there are those rare times when you really go crazy, because you’re absolutely sure your code is supposed to work (and it does not), and then you realize that the layer you should be able to trust is broken (in a way). Really, you want to kill someone… being a good guy, I won’t do that.

Did I hear anyone say modulo?? Oh, don’t even bother. This time I think it is Python, rather than the CPU, that returns the (mathematically) expected result. But what does it matter now? I really only want to imitate the processor’s behavior. So I had to hack that one too.

The solution, after all, was to take the absolute value of each operand while remembering its original sign. Then we do an unsigned division, and if the signs of the inputs are not the same we change the sign of the result. This works because we know that unsigned division behaves just as the processor does, so we can use it safely.

res = x/y; if (sign_of_x != sign_of_y) res = -res;
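Spelled out as runnable Python, a sketch of that approach could look like the following. It uses // so it behaves the same on Python 2 and 3, and it also derives the remainder, which then takes the sign of the dividend as on x86. The name idiv is mine, and the sketch ignores operand widths and the divide-error cases (division by zero, quotient overflow) that the real instruction raises.

```python
def idiv(x, y):
    """Signed division truncating toward zero, like x86 IDIV."""
    # Divide the magnitudes, where floor and truncation agree.
    quotient = abs(x) // abs(y)
    # Restore the sign if exactly one operand was negative.
    if (x < 0) != (y < 0):
        quotient = -quotient
    # Remainder consistent with the truncated quotient.
    remainder = x - quotient * y
    return quotient, remainder


print(5 // -3)       # Python's floor division
print(idiv(5, -3))   # truncating division and its remainder
```

Because the remainder is computed from the truncated quotient, quotient * y + remainder always gives back x.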

The bottom line is that I really hate this behavior in Python, and it’s not a bug after all. I’m not sure how many people like me have encountered this issue, but it’s really annoying. I don’t believe they are going to change it in Python 3, you never know though.

Anyway, I got my IDIV working now, and that was the last instruction I had to cover in my unit tests. Now it’s analysis time :)

Posted in Algorithms, Assembly, Debugging, Python

Embedded Python And raw_input

Monday, February 25th, 2008

If you ever messed with embedded Python you would know that you can’t use the Win32 console so easily. For that you have to hack around a few bits in order to bind stdin and stdout to the Win32 ones. Another familiar technique is to override sys.stdout with your own class that implements the ‘write’ method. Then every time something goes to output, this method is called with an argument, which you print on the screen with whatever Win32 API you want, WriteFile to name one…

The same technique also works for stderr, which is used when one types some invalid text that doesn’t compile and an exception is thrown.

With both stdout and stderr ‘hooked’, we still have stdin out of the loop. Now, in all these years that embedded Python has been around, I had never encountered the problem that stdin doesn’t work with raw_input(). Don’t be confused: the way things work in embedded Python, the interactive interpreter is not necessarily fed from stdin, it’s a bit more messy. In all the examples of embedded Python you have to implement the interactive interpreter on your own, therefore you bypass stdin from the first moment. Hence, if you never used raw_input you will not know whether it’s broken or not. So I want to thank a special friend for reporting this bug to me… ;)

The fix… To fix the problem I tried to replace sys.stdin with my own implementation of a class with a ‘read’ method (guessing, because stdout’s is ‘write’). Doing so I found out that the method name should be ‘readline’, according to raw_input’s shouts at me. So I renamed the method. Its implementation calls a function that I wrote in C and wrapped for Python, which reads input from the console (using ReadFile(GetStdHandle(STD_INPUT_HANDLE), …)) and returns the raw input string. That did the trick and now raw_input works again.
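A minimal sketch of the two hooks, written for Python 3 where raw_input() is called input(). The real plugin forwards to Win32 console functions written in C; here the ConsoleIn and ConsoleOut classes (names invented for the example) just use in-memory lists so that the snippet runs anywhere.

```python
import sys


class ConsoleIn:
    """Replacement for sys.stdin: input() only needs a readline() method."""

    def __init__(self, lines):
        self._lines = list(lines)

    def readline(self):
        # A real implementation would read from the console here.
        return self._lines.pop(0) if self._lines else ""


class ConsoleOut:
    """Replacement for sys.stdout: output only needs a write() method."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        # A real implementation would print to the console here.
        self.chunks.append(text)
        return len(text)

    def flush(self):
        pass


fake_in, fake_out = ConsoleIn(["hello\n"]), ConsoleOut()
old_in, old_out = sys.stdin, sys.stdout
sys.stdin, sys.stdout = fake_in, fake_out
try:
    answer = input("name? ")
finally:
    sys.stdin, sys.stdout = old_in, old_out

print(repr(answer), fake_out.chunks)
```

Note that the prompt passed to input() goes through the hooked stdout as well, so both hooks have to be in place for the console to look right.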

I am pretty sure that all embedded Python users out there will need this hack if they already hack stdout…

Posted in Python

Let Me Import

Wednesday, December 19th, 2007

The problem is that you cannot simply import modules from different (relative) paths in Python. Now sometimes it’s really a must. Like when you have a generic CPU module and then a subdirectory called ‘x86’ that wants to derive from that CPU class and create an ‘x86CPU’.

So I tried to mess with __import__, but without any luck. Then I said I just want to solve this problem, I don’t care how far (ugly) I go. I started with execfile, which AFAIK isn’t exactly like import, but it’s good enough for me. execfile(“../cpu.py”) didn’t work, unfortunately. I then realized that the cpu.py file imports another file from its own directory, and since execfile doesn’t do any magic with the directories, that import simply failed. Bummer. The next thing I did was to add the path of the CPU module to sys.path, and I retried my attempt with success. Although it works, I don’t like the directory separator, which isn’t cross-platform. And yes, I said it’s a temporary solution, but I’m still sensitive to it. Usually temporary hacks that work tend to stay there, as much as I don’t like this reproach… this is how things usually work, no?

So that one was solved, quite easily, but another one arose. In the original code (when all the files used to be in the same directory), when import worked, I always imported the module by name and not the lot of it (“from cpu import *” is a bad habit). So all my code accesses the CPU module with the ‘CPU.’ prefix, which is good, since you know the source of everything you touch. The new problem is that since I moved to execfile this prefix is broken, and I must get rid of it.

I thought about changing the globals dictionary to the name of the module I want to execfile() and then switch it back. But it becomes too nasty, and I’m not sure whether it would work anyway. My first attempt might be ok for some of you, alas, it’s still not good enough for me.

And I think my design with the files is ok, after all it makes sense. And yes, I know I (maybe?) should have put the files in site-packages directory and the import then would work well. However, I want the files to be in a specific directory out of Python reach (in a way).

Oh by the way, the code of my attempt is:

import os, sys
sys.path.append(os.getcwd()[:os.getcwd().rfind(os.sep)])
execfile(".." + os.sep + "cpu.py")

Ugly something, heh?

Ok I was just going to publish this post but I had to try another thing which really solved the issue. I managed to use __import__ well.

So now it looks like:

import os, sys
sys.path.append(os.getcwd()[:os.getcwd().rfind(os.sep)])
cpu = __import__("cpu")

This time it solves the second problem as well, the namespace (of the module) issue. And yet we still need the new path. Can we get rid of it?
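One possible answer on Python 3, offered as a sketch rather than as what I ended up using: importlib can load a module straight from a file path, without touching sys.path. The helper name import_from_path and the throwaway cpu.py it creates are made up for the demonstration.

```python
import importlib.util
import os
import tempfile


def import_from_path(module_name, file_path):
    """Load a module from an explicit file path without editing sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway cpu.py; its content is invented for the example.
workdir = tempfile.mkdtemp()
cpu_path = os.path.join(workdir, "cpu.py")
with open(cpu_path, "w") as f:
    f.write("class CPU(object):\n    name = 'generic'\n")

cpu = import_from_path("cpu", cpu_path)


class x86CPU(cpu.CPU):
    name = "x86"


print(cpu.CPU.name, x86CPU.name)
```

The module keeps its own namespace, so the prefix style of access still works. One caveat: if cpu.py itself imports sibling files from its own directory, those imports still need that directory on sys.path.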

Posted in Python

Python: Converting Signedness

Friday, November 30th, 2007

Many times I encounter Python modules written in C/C++ that hit you with a really annoying problem: the integers the module returns are all unsigned. That’s really a problem, because Python longs are (potentially) infinite (they are digit-strings) AFAIK, unlike C/C++ integers, which have a limit of, usually, 32 or 64 bits.

If you pass back -1 to Python and format it as unsigned long, you will get 2**32-1. Now suppose you really want to treat the number inside your Python code as a signed integer; you will have to convert 2**32-1 to -1 somehow. Most solutions I saw to this problem were to use struct in the following manner:

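import struct
struct.unpack("<l", struct.pack("<L", 2**32-1))[0]

This packs the unsigned long value 0xffffffff and unpacks it as the (signed) long -1 (using little endian for both; the byte order doesn’t matter as long as the two match). The “<” prefix also pins the size to 4 bytes, whereas a bare “l” uses the platform’s native long, which may be 8 bytes.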

You might want to use unsigned long long, that is 64-bit integers, in order to convert a bigger number. So you can do one of two things: convert your 32-bit integer to a 64-bit integer by sign extending it (and that’s a whole other challenge) and stuff it into a 64-bit unpack/pack, or test the size of the integer (by how many bits it takes) and call the matching unpack/pack pair.

It was then that I realized why I shouldn’t use this ugly trick. It simply doesn’t support Python’s longs. As I said earlier they are infinite and using this trick you are capped to 64 bits. So I thought of a better way, and not using any stdlib module at all, leaving it pure Python code…

The way we know to change the signedness of an integer is by negating it, which is NOTing all bits of that number and incrementing it by 1, right? (2’s complement) Well true and that should work:

(Say we work with 8 bits integers)

0xfb = -5

>>> ~0xfb
-252
>>> ~0xfb+1
-251
>>> 256-251
5
>>> 0-5
-5

The way it works is: -(256 + (~x+1)) where x is a byte. For every integer we need to scan for its most significant bit… We can do it the other way around with the following formula, x - NEXT_MSB(x), where NEXT_MSB(x) is the power of two just above x’s most significant bit (0x100 for a byte):

>>> 0xfb - 0x100
-5

This way it’s like we changed the sign bit of the integer and fixed the number as well. Both formulas work for integers of all sizes. But the key here is to find the MSB. I preferred to stick to the latter formula rather than the former, since it seems to be shorter. But it doesn’t really matter, both work.

So now we have to code it, and as you should know me already, as a one-liner device! The first challenge is to find the MSB; something like this should suffice in C(!):

int i;
for (i = 7; i >= 0; i--)
    if (x & (1 << i)) break;

This will work for a byte integer, and note that we must start from the end towards the start. Otherwise we won’t find the MSB but the LSB. The problem in Python is that we don’t know the size of the longs we mess with and we need to come up with a trick to find its MSB.

The lame trick I used for converting a number into decimal ASCII string, will be used here too and it goes like this:

for i in xrange(2**32):
    if n / (1 << i) == 0:
        break

We try to divide the input number by 1, 2, 4, 8, 16, 32, … and when the result is 0, we know that we are out of bits. I said it’s lame because we use division, which is slow. If you have any other idea, please write to me.
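
For what it’s worth, here are two division-free candidates, sketched in Python 3. The function names are made up; int.bit_length() is a real method (Python 2.7 and 3.1 onwards), so it was not available when this was written.

```python
def next_msb_shift(n):
    """Smallest power of two strictly greater than n, using shifts only."""
    power = 1
    while power <= n:
        power <<= 1
    return power

def next_msb_bit_length(n):
    """Same result, letting the interpreter count the bits."""
    return 1 << n.bit_length()
```

Both assume n is non-negative, and both return 0x100 for 0xfb, which is the value the subtraction formula needs.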

Another drawback is the limit of the numbers we scan, we are limited to 2**32, this is huge enough and I guess you will never reach that, or I will be dead first prolly :o. Using Erez’s trick (see here), we can make it a bit more elegant and stop as soon as the MSB was found.

I am not sure whether you noticed, but supplying an input of a negative number isn’t a smart move, we will have to check for it specifically. Eventually this is the code I came up with:

(Note that the input “bits-stream” can be any size)

def signed(n):
    return n if n < 0 else n - [i for i in (2**j if n/(2**(j-1)) else iter(()).next() for j in xrange(2**31-1))][-1]

>>> signed(0xb)
-5
>>> signed(0xfb)
-5
>>> signed(0xffffb)
-5
>>> signed(0xffffffffffffffffffffffffb)
-5L
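
The one-liner leans on Python 2 behaviour: xrange, the .next() method, integer division with /, and a StopIteration quietly ending the generator expression (since Python 3.7 that last one becomes a RuntimeError, per PEP 479). A rough Python 3 equivalent, swapping in int.bit_length() for the MSB search, might look like this; treat it as a sketch and not as the original method.

```python
def signed(n):
    """Treat the top set bit of n as a sign bit and return the signed value."""
    if n <= 0:
        # Negative input is passed through, as in the original;
        # zero has no set bit to reinterpret, so it is returned as-is.
        return n
    return n - (1 << n.bit_length())
```

Returning 0 for an input of 0 is my own choice here, since there is no MSB to find in that case.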

GZ-LN

Posted in Algorithms, Python | 3 Comments »

Challenge: One-Liner For Converting a Decimal

Thursday, October 25th, 2007

Or, a one-liner device to convert a decimal number to a string, using any base up to 10. If you want to use bases above 10, you will have to construct a table somehow that goes from ‘0’ to ‘9’ and then continues from ‘a’ up to the required base (or you can use a static table). So suppose we are dealing with a base <= 10: each digit only needs to be converted to ASCII, so it’s pretty simple.
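
For bases above 10, the static table could be built from the string module. This is only a sketch in Python 3, not a one-liner; DIGITS and to_base are names I made up.

```python
import string

# '0'..'9' followed by 'a'..'z', enough for bases 2 through 36.
DIGITS = string.digits + string.ascii_lowercase

def to_base(n, base):
    """Convert a non-negative integer to its string form in the given base."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if n == 0:
        return DIGITS[0]
    out = []
    while n > 0:
        n, remainder = divmod(n, base)
        out.append(DIGITS[remainder])
    # Digits were collected least significant first.
    return "".join(reversed(out))
```

A quick sanity check is the round trip through int(), e.g. int(to_base(255, 16), 16) gives 255 back.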

If you didn’t figure it out until now (and how could you?) I’m talking about Python here. There is this int() function (actually it’s a class type to be more accurate, its constructor), which converts a string in a given base to an integer. Say, int(‘101’, 2) will result in 5. But the opposite operation is nowhere to be seen.

The straight forward way is easy:

l = []
while n > 0:
    l.append(n % BASE)  # least significant digit first
    n /= BASE
"".join(map(str, l[::-1]))

Though, it’s an ugly way, just to show the principle. We can do it with recursion, and then we don’t need to reverse the result, by a side effect of recursion.
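
A recursive version could look roughly like this in Python 3 (a sketch, assuming n >= 0 and a base between 2 and 10; the name to_base is mine):

```python
def to_base(n, base):
    """Recursive conversion: the higher digits are produced first."""
    if n < base:
        # A single digit is left, so the recursion stops here.
        return str(n)
    # Convert everything above the last digit, then append the last digit.
    return to_base(n // base, base) + str(n % base)
```

The digits come out in the right order because the recursive call for the higher digits returns before the current digit is appended.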

When I decided to write the conversion function just for the fun of it, I wanted it to not use recursion…because with recursion it’s really easy. :) So why make our life simple when we do things for learning and sport? Besides, for some people recursion is less intuitive, although we might argue about it.

So here’s my first version:

"".join([str((n/i)%b) for i in [b**j for j in xrange(31, -1, -1)]]).lstrip('0')

At the beginning I used chr((n/i)%b + 0x30), because I’m used to dealing with char arrays and thinking in old school C code. So Kasperle came up with the str thingy, which is much better for code readability.

Anyway, I really got pissed with the idea that I have to drop all the leading zeros, otherwise for n=14 in base 2, I will get an output of ‘00000000000000000000000000001110’, which is quite cumbersome.

One drawback is the size of integer we want to convert, as you probably guessed, this code supports 32 bit numbers, it might support any number in a jiffy… But then you will probably have to strip more zeros most of the times. ;( Enough fooling around.

What I’m really trying to achieve is to use the code to convert any sized number, without the need of any constant magic value in my one-liner.

So trying to come up with the accurate number of digits to convert in the first place is really the bugging trick. What we really need is something like math.log. Using the log we can know the number of digits at once. But then we need to import math. Do we allow ‘import’s when we say one-liner or not? Well, I will take it as No, making my life harder without math.

"".join([str((n/i)%b) for i in [b**j for j in xrange(int(math.log(n, b)), -1, -1)]])

I could have used the input number for the xrange, but it won’t return ‘0’ for an input of zero. And even so, it’s kinda cheating and lame.
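
One more caveat with the log approach: math.log returns a float, and for exact powers of the base the result may land just below the true integer, which would drop a digit once it is truncated. Counting the digits with integers only avoids both that and the import. Here is a sketch in Python 3; it is not a one-liner any more, and num_digits and to_base are made-up names.

```python
def num_digits(n, b):
    """Number of base-b digits of a non-negative integer (at least 1)."""
    digits = 1
    while n >= b:
        n //= b
        digits += 1
    return digits

def to_base(n, b):
    # Walk the powers b**(digits-1) ... b**0 and pick out each digit.
    return "".join(str((n // b**j) % b)
                   for j in range(num_digits(n, b) - 1, -1, -1))
```

Zero comes out as ‘0’ here, because num_digits never returns less than 1.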

Technically, the solution is to generate a list with [1, 10, 100, 1000, ….]. The condition to stop is when n/entry == 0. The problem to make this list is how to generate it on the fly? :) or how to stop generating it.

Well, AFAIK in Python it’s not possible. So I’m trying to simulate log. Imri just suggested to use a rate number for a log approximation which will be base dependent. But I didn’t like that idea – magic numbers, remember? And maybe even losing precision.
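
If an import of itertools is acceptable (and it may well break the same no-imports rule that ruled out math), takewhile can stop the list of powers as soon as it passes n. This is a Python 3 sketch with a made-up function name:

```python
from itertools import count, takewhile

def to_base(n, b):
    # Generate 1, b, b**2, ... lazily and stop at the first power above n.
    powers = list(takewhile(lambda p: p <= n, (b**j for j in count())))
    # Highest power first; an empty list means n was 0.
    return "".join(str((n // p) % b) for p in reversed(powers)) or "0"
```

No magic constants are needed, and there are no leading zeros to strip.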

By now, Kasperle, who was the recursion guy, lost his patience with my stupid challenge. Imri is trying to calculate crazy numbers for log approximations, which I stopped following long ago. :)

FYI: Kasperle’s code, which is pretty cool, goes like this:

foo = lambda n,base: (n or "") and (str(foo( n / base, base)) + str( n % base))

Notice the way the recursion stops…However, in one-liner code, I prefer assigning the result to a value, rather than assign the lambda and call it. But it’s also possible to do, for instance: x = (lambda y: y+1)(0). But if you ask me, I don’t really like this notation.
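
To spell out how the recursion stops: when n reaches 0, (n or "") evaluates to the empty string, which is falsy, so the and short-circuits and returns "" without recursing further. Here is a Python 3 flavoured sketch of the same lambda (// instead of /, and without the redundant outer str), followed by the call-in-place notation:

```python
# Recursive lambda: the empty string from (n or "") ends the recursion.
foo = lambda n, base: (n or "") and (foo(n // base, base) + str(n % base))

# Defining and calling a lambda in one expression.
x = (lambda y: y + 1)(0)
```

One side effect of the stop condition is that foo(0, base) gives an empty string and not ‘0’.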

Then Imri suggested another idea using sqrt, but I objected since we need math. The truth is that you can do x**0.5 in Python. But eventually his solution wasn’t good enough.

ARRRG, As for now I am giving up :(. If you have another idea, let me know.

Posted in Algorithms, Python | 3 Comments »

Lambdas Forever

Saturday, October 20th, 2007

Ahhh Python, what a splendid scripting language. One of the most likeable features is the anonymous functions, aka Lambda. The lambda is actually a way to write/implement a simple one liner function in-place. The official docs say:

“Lambda forms (lambda expressions) have the same syntactic position as expressions. They are a shorthand to create anonymous functions; the expression lambda arguments: expression yields a function object.”

Instead of implementing the damned comparison function for sorting, you probably all know what I’m talking about:
def ComparisonProc(x, y):
    return y - x
list.sort(ComparisonProc)

We can simply do:
list.sort(lambda x, y: y - x) # Descending
and voila.
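
In Python 3 the sort methods no longer accept a comparison function directly, so the same descending sort would have to go through key=. A small sketch of the options (the sample list is made up):

```python
from functools import cmp_to_key

numbers = [3, 1, 4, 1, 5]

# Wrap the old-style comparison lambda so it can be used as a key.
by_cmp = sorted(numbers, key=cmp_to_key(lambda x, y: y - x))

# Or express the ordering as a key function directly.
by_key = sorted(numbers, key=lambda x: -x)

# Or skip the lambda altogether.
by_flag = sorted(numbers, reverse=True)
```

All three produce the same descending list; the lambda still earns its keep whenever the key is anything less trivial.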

This is a very simple example. Lambdas are really handy when you want to do one liner devices. Some of them which you manage to stuff in one line and some which you just can’t. However, without lambda it wouldn’t have been possible in the first place.

There are many samples on the Internet. I came up with something, hopefully even useful. Let’s say you want to print all .JPG files on your c:\, including subdirectories. So we have to scan the hard drive for all files, then filter those with .JPG extension and afterwards print the result. :) Yes this is all possible in one-liner, let’s see the original code first:

import os

for root, dirs, files in os.walk('c:\\'):  # 'c:' alone means the current directory on that drive
    for i in files:
        if i[-4:] == ".jpg":
            print i

The one-liner version:

print filter(lambda name: name[-4:] == ".jpg", reduce(lambda x,y:x+y, [i[2] for i in os.walk('c:\\')]))

See? Easy :)

Actually now, I have to explain a few more things.
1) We are only interested in the Files list from os.walk, therefore we take the third entry in the result, that’s i[2].

2) The i[2] itself, is a list, and we cannot filter a list of lists with the file names, therefore we have to flatten the lists to a single list containing the file names. This is where the reduce comes in, it will return the accumulated result of all lambdas, each time calling the lambda with the accumulated x and supplying the next item, y. Thus, adding the lists extends the resulting list and flattens them…

3) Now that we have a single list with all file names in the system, we need to filter out the files which are not .JPG. So yet again we use a lambda that checks the last 4 characters in the file name to see whether it is a .JPG, all other files will be removed from the resulting list.

4) Lastly, print the result. Actually you can use pretty print (module pprint) to print it prettier :)
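
The four steps above can be sketched in Python 3 as well, where reduce has moved to functools, filter returns an iterator and print is a function. The function name find_jpgs and the case-insensitive check are my additions, not part of the original one-liner.

```python
import os
from functools import reduce

def find_jpgs(top):
    # Step 1: keep only the file-name lists from os.walk's tuples.
    file_lists = [files for _root, _dirs, files in os.walk(top)]
    # Step 2: flatten the list of lists; the [] start value covers an empty walk.
    all_files = reduce(lambda x, y: x + y, file_lists, [])
    # Step 3: keep names whose last four characters are ".jpg" in any case.
    return list(filter(lambda name: name[-4:].lower() == ".jpg", all_files))
```

Step 4 is then just print(find_jpgs(some_directory)), with some_directory being whatever path you want to scan.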

Yes, Python’s crazy!

So what are the annoying things with lambdas? They are slow relative to list comprehensions (which we used to get all lists of file names above). But again, if we are using scripting, are we after speed? I am not sure. Another irritating thing about lambdas is that you cannot assign inside the expression, but then you have reduce.. :)

The worst thing about lambdas is when you use variables from outside the lambda, and let me explain. The lambda’s body is only evaluated when you call it, so any outside variables it accesses are looked up at call time, every time. Now think that you wanted the lambda to use the value a variable had when you created the lambda, and then when you really call the lambda, that value was already changed and your result is screwed.

Enough words, let’s see the problem with some code:

>>> x, y = [lambda z: 5+z+i for i in xrange(2)]
>>> x(0)
6
>>> y(0)
6

Oh uh, we’re in trouble!

Do you notice we get the same result for both functions? This is incorrect because they are not supposed to return the same value. Note this:

>>> i
1

So now when both lambdas, x and y, are evaluated they use i as 1. Sucks huh?

>>> i = 3
>>> x(0)
8
>>> y(0)
8
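
The same trap exists in Python 3, with one difference worth knowing: the loop variable of a list comprehension no longer leaks into the enclosing scope, so you cannot poke at i from outside, but the lambdas still share it. A small sketch (the variable names are mine):

```python
# Both lambdas close over the same variable i, not over its value.
adders = [lambda z: 5 + z + i for i in range(2)]

# By the time they are called, the comprehension has finished and i is 1.
late_results = [f(0) for f in adders]
```

Both entries of late_results come out as 6, just like in the session above.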

“Use Nested Lambdas, Luke”

>>> x, y = [(lambda v: lambda z: 5 + z + v)(i) for i in xrange(2)]
>>> x(0)
5
>>> y(0)
6
>>>

The outer lambda gets evaluated immediately, so v is bound to the value that i has at that moment, and not to the variable i itself. Next time when the inner lambda is evaluated it uses that saved value and not whatever i holds by then.
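
If the nested lambdas are hard on the eyes, the same idea reads a bit easier as a small factory function. A sketch in Python 3, with make_adder being a name I made up:

```python
def make_adder(v):
    # v is a fresh local for every call, so each lambda gets its own value.
    return lambda z: 5 + z + v

x, y = [make_adder(i) for i in range(2)]
```

Calling make_adder(i) plays the role of the outer lambda: it runs right away and freezes the current value of i into v.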

This surely will help someone out there :) And then they say Lambdas are going to be deprecated in Python 3000…

[Updated]

Thanks to Kasperle, here’s another solution to the nested lambdas:

x, y = [lambda z, dummy = i: 5 + z + dummy for i in xrange(2)]

The drawback is that x and y can now get a second parameter which you potentially let the caller override…Or if we’re talking about Python 2.5 you can use functools.partial.
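
Here is a sketch of both variants side by side in Python 3, to show the drawback and the functools.partial alternative. The helper add5 and the variable names are made up for the example.

```python
from functools import partial

def add5(i, z):
    return 5 + z + i

# functools.partial freezes the first argument at creation time.
px, py = [partial(add5, i) for i in range(2)]

# Default-argument version: the default is evaluated when the lambda is created.
dx, dy = [lambda z, dummy=i: 5 + z + dummy for i in range(2)]

# The drawback: a caller can override the captured value.
overridden = dx(0, 100)
```

With partial, passing a second positional argument fails with a TypeError, since add5 only takes two parameters and one is already filled in.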

Posted in Algorithms, Python | 7 Comments »
