Archive for April, 2008

JavaScript Sucks

Tuesday, April 29th, 2008

I know many languages pretty well, but this one is really ugly, or stupid, or what not. So many of its features are just “hacks”, each browser does whatever it wants with the same code, and there’s chaos around JS everywhere you go.

For example, what we call a ‘dictionary’, which is an associative array, is a big hack in the language. It is practically an object on which you can set properties and then iterate over them. There’s no formal method to remove a key from the dictionary, like you would expect in a scripting language, by doing myDict.remove("key"). You have to do delete myDict.key instead. Not to mention how to know whether you have any keys in the dictionary, because who said you have a length property? Well, if you think you have one, you’re wrong, and that’s because you used an array as a dictionary instead of creating an object using { }.

Another thing I encountered: if the last element defined in a dictionary ends with a comma, then the browser (IE) will shout at you while other browsers swallow it happily. It reminds me of macros in C/C++, where you don’t know where the originating code that caused the problem is, since it gets compiled after it’s substituted… So {a:1,} will kick.

Another ugly thing is this fake OOP, now who are you kidding? There’s a special use for the “this” keyword, but otherwise everything else is just nested functions, err sorry, methods. This is another ugly hack, and some people even use inheritance. Do me a favor. The error-prone “class” that you declare will probably waste memory, because the methods are defined as nested functions, which get allocated per instance, rather than with something like MyClassName.prototype.myMethodName, which will certainly work better.

Did you say private member? Oh yeah, right. That’s what you think, and this time you’re right, because they are local variables of the “class”, which is really a function that gets run when you create an instance. However, you don’t have control over public/readonly, etc., which would be pretty useful. So the constructor is free of charge, because it’s the code in the “class” function, where you also define the private variables. And I won’t call them private members.

Now you say, “of course, there’s no need for a destructor, a scripting language has a GC”. Well, that’s right, but when an element points to code, using onClick for example, and that handler has a variable that points back to that same element, then you’re in circular trouble ;) So this time you might want a destructor, right? Or some function that gets called on unload, so you can null out a few variables to break the circular references… Yes, this problem can happen in many environments, but Java, for the sake of conversation, solves this one, unlike Python, AFAIK.

Now why the heck do browsers need to compile (yes, in a way) code??? We all just grew up believing that’s something normal, but stop and give it a thought. I guess those guys didn’t hear about standards.

You can even open a new nested block using curly braces, but the variables you declare there with var are not scoped to that block: they belong to the whole enclosing function, and at the top level they become globals. So you end up having to delete some objects manually. Now don’t start with why you would wanna delete a variable; there are good reasons for that sometimes, and that’s another story.

Did you know about javascript compiler time machine? Ahh of course not, let me show you:

var a = "DEFINED";

function f() {
 alert(a);
 var a = 5;
}

f();

Will this code snippet open an alert with a text of “DEFINED”? No, now keep on reading.

If you run the code snippet above and call f(), you will not get an exception; the alert will simply show “undefined” rather than “DEFINED”. The compiler, or whatever freak is under there, sees the a, which is really defined in the global scope, right? Yes, it is, seriously. But then it sees later on that a variable ‘a’ is declared in the scope of the function ‘f’, so it treats ‘a’ as local to the whole function, and that local one has no value yet when the alert runs. Make an experiment: remove the ‘var’ from ‘var a = 5;’ and see the results for yourself.

And there are more and more quirks in this language that I will leave for another time. So what do you think, is Silverlight the next best thing?

Signed Division In Python (and Vial)

Friday, April 25th, 2008

My stupid hosting company decided to move the site to a different server blah blah, the point is that I lost some of the recent DB changes and my email hasn’t been working for a week now :(

Anyway, I’m reposting it. The sad truth is that I had to find the post in Google’s cache in order to restore it, way to go.

Friday, April 18th, 2008:

As I was working on Vial, implementing the IDIV instruction, I needed a signed division operator in Python. And since the x86 is two’s complement based, I first have to convert the number (from unsigned) into a Python negative and only then do the operation, in my case a simple division. It was supposed to be a matter of a few minutes to code this function, which gets the two operands of IDIV and returns the result, but in practice it took a few bad hours.

The conversion is really easy: say we mess with 8-bit integers, then 0xff is -1, 0x80 is -128, etc. The equation to convert a value to a Python negative is: val - (1 << sizeof(val)*8). Of course, you do that only if the most significant bit, the sign bit, is set. Eventually you return the result of val1 / val2. So far so good, but no: as I was feeding my IDIV with random input numbers, I saw that the result my Python code returned was not the same as the processor’s. This was when I started to freak out, trying to figure out what the problem was with my very simple snippet of code. And alas, later on I realized nothing was wrong with my code, it’s all Python’s fault.
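The conversion described above can be sketched in a few lines of Python. The function name to_signed and its bits parameter are my own for illustration (not taken from Vial), and the width is passed explicitly since Python integers have no sizeof:

```python
def to_signed(val, bits):
    """Interpret an unsigned `bits`-wide value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    if val & sign_bit:
        # Sign bit set: subtract 2**bits to get the negative Python int.
        return val - (1 << bits)
    return val


print(to_signed(0xFF, 8))  # -1
print(to_signed(0x80, 8))  # -128
print(to_signed(0x7F, 8))  # 127
```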

What’s wrong with Python’s divide operator? Well, to be strict, it does not round a negative result toward 0, but toward negative infinity. Now, to be honest, I’m not really into math stuff, but all x86 processors round negative results (and positive ones too, to be accurate) toward 0. So one would really assume Python does the same, as C would, for instance. The simple case to show what I mean is 5/-3, which in Python results in -2 rather than -1, which is what the x86 IDIV instruction should and does return. And besides, -(5/3) is not equal to 5/-3 in Python; now it’s the time you say WTF. Which is another annoying point. But again, I’m not a math guy, and from speaking with many friends about this behavior, that equality (or to be accurate, inequality) is OK in the real world. Seriously, what do we coders care about real-world math now? I just want to simulate a simple instruction.

I really wanted to go and shout “hey, there’s a bug in Python’s divide operator”, and how come nobody saw it before? But after some digging, it turns out this behavior is documented in Python. As much as I and many other people I know hate it, that’s that. I even took a look at the source code of the integer division algorithm and saw a ‘patch’ that floors the result when it is negative, because C89 doesn’t define the rounding well enough.
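To see the difference for yourself, here is a tiny demonstration. Note that in Python 3 the flooring integer division is spelled // (in the Python 2 of this post, / on two ints behaves the same way); the float-based truncation on the last line is only there for comparison and is safe only for small values:

```python
# Floor division rounds toward negative infinity.
floored = 5 // -3
# Negating a positive quotient gives the truncated (x86-style) answer.
negated = -(5 // 3)
# Truncation toward zero via float division; only exact for small values.
truncated = int(5 / -3)

print(floored)    # -2
print(negated)    # -1
print(truncated)  # -1
```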

When you’re coding something and you have a bug, you usually just start debugging your code, track the bug down and fix it easily while you keep working on the code, because you’re in the middle of the coding phase. Then there are those rare times when you really go crazy: you’re absolutely sure your code is supposed to work (which it does not), and then you realize that the layer you should be able to trust is broken (in a way). Really, you want to kill someone… being a good guy, I won’t do that.

Did I hear anyone say modulo?? Oh don’t even bother, but this time I think that Python returns the (math) expected result rather than the CPU. But what does it matter now? I really want only to imitate the processor’s behavior. So I had to hack that one too.

The solution, after all, was to take the absolute value of each operand while remembering its original sign. Then we do an unsigned division, and if the signs of the inputs are not the same, we change the sign of the result. This works because we know that unsigned division behaves as the processor does, so we can use it safely.

res = abs(x) / abs(y); if (sign_of_x != sign_of_y) res = -res;
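Spelled out as runnable Python, the trick above might look like the sketch below. The name idiv is mine, not Vial’s actual code; the remainder line is my assumption of how the modulo “hack” could be done (on x86 the remainder takes the sign of the dividend), and the overflow check that the real instruction performs on the quotient is left out:

```python
def idiv(x, y):
    """Signed division that truncates toward zero, like x86 IDIV.

    Returns (quotient, remainder). Operands are plain Python ints that
    were already converted from their unsigned encoding.
    """
    if y == 0:
        raise ZeroDivisionError("the CPU raises a divide error here")
    # Unsigned division on the magnitudes floors and truncates identically.
    q = abs(x) // abs(y)
    # Restore the sign: negative only when the operand signs differ.
    if (x < 0) != (y < 0):
        q = -q
    # Assumption: remainder defined so that x == q * y + r.
    r = x - q * y
    return q, r


print(idiv(5, -3))   # (-1, 2)
print(idiv(-5, 3))   # (-1, -2)
print(idiv(7, 2))    # (3, 1)
```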

The bottom line is that I really hate this behavior in Python, and it’s not a bug after all. I’m not sure how many people like me have run into this issue, but it’s really annoying. I don’t believe they are going to change it in Python 3; you never know, though.

Anyway, I got my IDIV working now, and that was the last instruction I had to cover in my unit tests. Now it’s analysis time :)

Debugging Symbols

Saturday, April 5th, 2008

We all like to use PDB files when they are available. Sometimes we have to get them from MS over the Internet. Usually I download the whole package of symbols for my current OS and am done with it. The problem is that sometimes, after updates and the like, they go out of date and the files I got are not relevant anymore. And as I use WinDbg, I decided to set the symbols path variable in the system environment, so that other apps pick it up as well. Though I really ask myself how come I hadn’t done it before, because at work I have been using it for a long time now…

Anyhow, I set that variable to: SRV*http://msdl.microsoft.com/download/symbols;C:\symbols;

And I was happy then that everything loaded automatically. Afterwards I noticed that every time I started debugging code in MSVS, it accessed the inet for something, blocked the whole application for a few seconds, and only then started my application and let me debug it. The point is that it was my own application, with full source and everything, and it still accessed the inet every time, maybe to check timestamps against the loaded modules, etc. It even caused some problems, like saying that my source wasn’t the same as the binary I wanted to debug and not letting me use the source when debugging that code.

So after a few rounds of this same confusion, I couldn’t continue working like that anymore, and I tried to think what I had changed that caused this weird debugging behavior. Only then did it come to my mind that I had added that extra variable to the environment. So I took a look at the variable and, on a hint from a friend, I switched the places of the local symbols directory and the http address. Since then I don’t get any weird trips to the inet to get/check PDBs when they are not required, and everything runs as fast as before.

That’s it, I just wanted to share this issue. Of course, I don’t blame any application for using the first address in the variable first, because it’s up to the user to define the priorities. It is just that I didn’t think it would matter… only to learn I was wrong.

AAA/AAS Quirk

Thursday, April 3rd, 2008

As I was working on the simulation of these two instructions, I found that they have a quirk. Although the algorithms for these instructions are described in Intel’s specs, which (seem to) make the output defined for all inputs, that is not the case. Every time I finish writing an implementation for a specific instruction, I add that instruction to my unit tests. The instruction is simulated with random (and some smarter) input and then checked against pure native execution to see if the results are correct. This way I found a quirk, for a range of inputs, that reveals how the instruction is really implemented (microcode stuff prolly) rather than how it’s documented.

AL = AL + 6 is done when AF is set or the low nibble of AL is above 9. According to the documentation the destination register is AL, but in reality the destination register is AX. Now how do we know such a thing?

If we try the following input (with AH being 0 beforehand):
mov al, 0xff
aaa

The result in AX will be 0x205, rather than 0x105 (which is what we expect according to the docs).

What really happens is that we supply a number that, when 6 is added to it, creates a carry into AH, thus incrementing AH by 1. Then, looking at the docs again, we see that whenever 6 is added to AL, AH is also incremented by 1 explicitly. Thus AH is really incremented by 2. :P
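Here is a small sketch of the two behaviors side by side. Both function names are made up for this illustration; the first follows the documented algorithm as described above (the add stays inside AL), the second models what I observed (the add is done on AX). The flags that AAA also updates are not modeled:

```python
def aaa_documented(ax, af):
    """AAA as documented: AL = AL + 6 stays within AL, then AH = AH + 1."""
    al, ah = ax & 0xFF, (ax >> 8) & 0xFF
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0xFF   # any carry out of AL is dropped
        ah = (ah + 1) & 0xFF
    al &= 0x0F                 # keep only the low nibble of AL
    return (ah << 8) | al


def aaa_observed(ax, af):
    """AAA as observed: AX = AX + 6, so a carry from AL reaches AH."""
    ax &= 0xFFFF
    if (ax & 0x0F) > 9 or af:
        ax = (ax + 6) & 0xFFFF      # carry propagates into AH
        ax = (ax + 0x100) & 0xFFFF  # the explicit AH = AH + 1
    return ax & 0xFF0F              # clear the high nibble of AL


print(hex(aaa_documented(0x00FF, False)))  # 0x105
print(hex(aaa_observed(0x00FF, False)))    # 0x205
```

For valid unpacked BCD input the add never carries out of AL, so the two functions agree there; they only differ for inputs like the one above.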

The question is why they do AX = AX + 6, rather than operating on AL. No, actually the biggest question is why I get this same behavior on an AMD processor (whereas I work on an Intel processor). And we already know from my last post about SHLD that they don’t work the same in some undefined-behavior aspects (although some people believe AMD copied Intel’s architecture and implementation)…

There might be some people who will say that I went too far with testing this instruction, because I, somewhat, supply an input which is not in the valid range (it’s unpacked BCD after all), and therefore I must not rely on the output. The thing is, the algorithm is defined well enough to accept any input I pass it, hence I expect it to work even for undefined input. Though I believe there is no such thing as undefined input, only undefined output, and that’s why I implemented my instruction the way they both did. Specifically, neither of them states anything about undefined input/output, which makes my case stronger. Anyway, the point is that they don’t tell us something here: the implementation is not the same as the one documented in both the AMD and Intel docs.

This quirk works the same for AAS, where instead of doing AL = AL - 6, it’s really AX = AX - 6. I also tried to see whether they work on the whole of EAX, but I saw that the high word wasn’t changed (by carry/borrow). And I also tried to see if this ill behavior is found in DAA/DAS as well, but no.
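The same kind of sketch for AAS, again with made-up function names and without modeling the flags. The sample input is one I picked because the subtraction borrows out of AL; the printed values are simply what these two models compute, following the description above:

```python
def aas_documented(ax, af):
    """AAS as documented: AL = AL - 6 stays within AL, then AH = AH - 1."""
    al, ah = ax & 0xFF, (ax >> 8) & 0xFF
    if (al & 0x0F) > 9 or af:
        al = (al - 6) & 0xFF   # any borrow out of AL is dropped
        ah = (ah - 1) & 0xFF
    al &= 0x0F                 # keep only the low nibble of AL
    return (ah << 8) | al


def aas_observed(ax, af):
    """AAS as observed: AX = AX - 6, so a borrow from AL reaches AH."""
    ax &= 0xFFFF
    if (ax & 0x0F) > 9 or af:
        ax = (ax - 6) & 0xFFFF      # borrow propagates into AH
        ax = (ax - 0x100) & 0xFFFF  # the explicit AH = AH - 1
    return ax & 0xFF0F              # clear the high nibble of AL


print(hex(aas_documented(0x0200, True)))  # 0x10a
print(hex(aas_observed(0x0200, True)))    # 0xa
```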