Thoughts About Open Source Community In RE

Hello there,
if you wanna know me better and you use diStorm, take 10 minutes from your precious time and please read my post.

If you don’t give a damn about diStorm or open source community, you will waste your time reading this post, so visit here.

Believe it or not, I would have released this code (new version) under BSD. I really wanted to. The proof is that diStorm64 has been BSD so far, and still is, though it is deprecated. But I don’t see a good end for BSD. Let me explain. It’s very permissive, everybody can use it and I like it more than GPL for that reason, truly. Why, after all, did I decide to release diStorm under a GPL license, and not even LGPL? That’s a good question and I will answer it by telling you my story and more.

I began this project for pure fun and the challenge of decoding x86, back in mid 2003. Soon enough I had a basic framework that could accept a stream, look in some data structure and fetch an instruction, then decode the operands. Sounds easy, right? Because it is easy. But that was only for integer instructions. After a few months I added FPU support. This is where I started to hate the x86 machine code, really. It’s a pain in the arse. Only at that time did I start to look around and see what other (disassembler) libraries were out there on the Internet. It was already 2004, and I still enjoyed coding diStorm for the fun and sport of it, 100% from scratch, always. I knew from day one that I was going to make it an open source project. And yet, it wasn’t so useful, there were better disassemblers out there. I’m talking about binary stream disassemblers, mind you, not the GUI wrapped ones, with high level features. Anyway, diStorm wasn’t anything special yet. Therefore I had to work hard, in my free time, no complaints ever. And I added all those SSE instruction sets. Like 5 sets were eventually added to diStorm, in addition to new sets every now and then in other computing fields. Honestly, I doubt people use diStorm for SSE, but you never know. Besides, the goal of diStorm was to be a complete product, top quality and optimized, and I achieved all of that in time. diStorm was open sourced in the beginning of 2006. A few months later I added AMD64 support, and then diStorm was the first open source disassembler library to support it.

All the while I got lots of emails about diStorm. Some were asking for help with how to use it, some were about defects in specific instructions, etc. And there were even two critical bugs: one was a code regression where I accidentally introduced a bug, and the other was a memory leak in the Python module, which I happened to have fixed already.

The most appreciated work from the community was on the sample projects. People helped me with the code to make it more useful, and with better code for each platform (there are separate Win32 and Linux projects). But never anything about diStorm’s code itself. Maybe the project is too complex. Maybe there was no need, overall it was stable and mature. I don’t know.

Since 2005 I got more than 50,000 downloads of the variants of diStorm (the sample projects, the full source code, the library, the Python modules, etc). It is a lot, like 10K a year. Don’t forget that after all it’s a mere disassembler, not some crazy application, and it’s even a library, so only developers can use it. Though later I added a compiled build of the flat disassembler project, which can be downloaded in binary form. And what do we learn from this? Nothing, nada! You can never trust statistics like these, and they don’t mean much. Cause there are new releases, and the same person can download the project a few times over a period, to get the latest version, etc. So I can’t have any info about how many people/companies use it around the world.

My own goal was to make diStorm the best disassembler out there. Only you can judge. I know what it’s worth.

Sometimes I had doubts about this. And it gave me the inspiration to go on and bring the next generation disassembler library to the world, one after which I can retire, in a way, and finally start to enjoy the fruits of my hard work by coding code analysis tools myself in the future. About time, yey, no more excuses, no more stupid string parsing, totally efficient.

I see many people complain about the general status of tools in the Reverse Engineering field. Not many people open their code. And the rest make money out of it. And I want to make a better community in this field. I really tried to do so, but without success. People that use diStorm either use it as is, and they won’t open their code because they want to keep their precious knowledge to themselves, or they sell it for money. That’s legitimate. You work, you get money. And it also gives the opportunity for tools to exist, otherwise they wouldn’t be there at all. But I think we can do better. I am doing better myself this way, I believe so, and therefore I open the code under GPL. GPL is ugly and the only reason I am for it is because I want the community to help me with this project and the coming one. Ohhhh, you will love the next project. It’s not a library anymore but a whole studio…dreams come true. I make them come true on my end, but I need help. For crying out loud.

I, hereby, am asking the community, the people out there who do this stuff for fun or profit, to help, to contribute back to the community. I can’t do it all myself, it will take years. Though I am willing to do it myself, and that’s what I do. But then WHAT FOR?! So some companies can enjoy my work and make money at my expense? NO THANKS.

Community, Community, Community, this is key, this is what the whole issue is about. You are saying you need open source tools, no problem, but share. Let’s do it together. Parsing strings is not the way. Finally we got a good weapon, so let’s build a framework on top of it. We got Olly, we got WinDbg, and we got IDA. None of them is open source. Each has a clear win in its niche. I think there is room, a NEED, for something new, free and open source. If I am not going to get help this time, I am not going to open source the next project, because it’s pointless and by now you should know exactly why.

You know what, maybe it’s all my fault. Maybe diStorm’s code is TOO complex. (What do you expect though?). Maybe the stupid and ugly diStorm page is not easy to follow. Otherwise why do I see someone spending a few hours taking a disassembler out of a bigger project and making it work for the Windows kernel, while you can do that with diStorm in two clicks?

Maybe nobody gives a f@ck about it, probably, just a stupid disassembler. But no more. This is the time to make a change, a big one. It might be a disassembler, or anything else, but it just shows the attitude of the community. And don’t kid yourself, everybody looks for code analysis stuff, or eventually writes it up on their own. C’est tout. Mailing list or not, if we are not going to help each other and only complain that there are not enough open source projects in the RE field, nothing good is gonna happen.


Gil Dabah, Arkon

P.S – You can start by forwarding a link to this post.

19 Responses to “Thoughts About Open Source Community In RE”

  1. djTeller says:

    It seems like the community thinks that diStorm is stable, but I think that too :P
    I’ll be glad to help in your next project.

  2. Ahmed Essam says:

    I am with you, you have my contacts, call whenever you need me. I never used diStorm before but my free time is yours.

  3. arkon says:

    Hey Teller and Ahmed.
    Once I get more info I will let you guys know.
    Thanks for supporting anyway.

  4. developer says:

    Please leave your contacts or drop me an e-mail. I have started a similar project (disassembler studio) a few times during the last 3 years, but each time it stalled after 2-3 months. I will be glad to help you.

  5. AlienRancher says:

    The older versions will still be covered by the BSD license, right? So the people that don’t want to share won’t share; they will use the older copies or just violate the new license.

    Others will instead use something inferior under a liberal license, telling themselves that they can fix the problems.

    Others will just pay money to somebody who delivers a black box solution, maybe even based on diStorm (!!)

    All accounted it might be worse. And your point about not knowing what is really happening with your product still applies. At least for most of the users.

    The product I work on is a large open source project with a bsd-like license. I doubt we can use diStorm (ianal) now. Not that we need it.

    Yes, it is niche, complex and diStorm has few (or none?) bugs. The bsd model is much less useful when the project is so mature. Useful in the sense of helping improving it. bsd is still useful to foster usage.

    But you are welcome to experiment with GPL. You can always change it to something else or dual-license or whatever. Or just make the studio one GPL and diStorm bsd.

    Anyhow, I hope you keep the good work on diStorm. I share your frustration about the community.

  6. arkon says:

    Of course they are, can’t change that, and I’m happy with that decision back then.
    If users prefer to use the old diStorm, and probably many will, no problem. They don’t get bug fixes and new instruction sets. And obviously the structure output, though it’s very complex, and I’m not sure how many people out there will be able to use it. That’s going to be interesting to see. Only the ones who really need it will be able to use it, I bet.

    There’s no point in somebody delivering diStorm as a black box; I still sell the new version, dual licensed, to specific customers. If companies want some real disassembler, there’s no point in them developing it in house, it is cheaper for them to purchase a license.

    Nothing is perfect, if you can’t use diStorm because of its license, and it is a big open source project, and it may give a boost, I’m sure FSF will make the right decision and exceptions.

    Yeah, I have to try as long as I can, in my way. If I see it doesn’t work well, I will need to think what’s next.

    Thanks for supporting.

  7. ChromeSilver says:

    Hi Arkon,
    @(y)our service!

  8. […] Thoughts about Open Source Community in RE – A few words about the diStorm disassembler and the community […]

  9. Longpoke says:

    “I see many people complain about the general status of tools in the Reverse Engineering field. Not many people open their code. And the rest make money out of it.”

    Can’t agree more. Ollydbg was half decent, but it’s closed source and it is no longer relevant to any real task. IDA… closed source, no thanks. Any good reverse engineer will need/want to edit his tools from time to time.

    I’d like to help, but I refuse to help anyone program C/C++. The whole reason I need reverse engineering tools is to verify the security of systems, and I have no interest in auditing the actual RE tools for vulnerabilities… Not to mention strange corruption bugs and then you don’t know whether your disassembler is broken, or the stream to it, or the front end.. C/C++ effectively diminish the benefits of open source for me, they aren’t worth the tiny speed boost, and should only be used on the parts that they are absolutely essential for.

  10. arkon says:

    Alright then, what are your options? Java? C’mon, no way.
    C#, a very good language, but then you’re limited only to Windows. Mono.. hmmm might be an option in the future.

    Python? Not for a real heavy framework, with all due respect to it.
    So, again, I’m left with C/C++.
    Enlighten me ;)

  11. Longpoke says:

    Java is a _much_ more sane option than C/C++ in this case, as it is about as fast as C/C++ or faster, depending on what exactly you’re doing, will work on any O/S without fooling around with porting, and will save 90% of your time. There are already lots of reversing tools written in Python. C#… No, that is and always will be worse than C/C++ in terms of portability. I don’t see what’s wrong with Java or Python.. If you are making a debugger you just need a few JNI (Java) or C extension (Python) calls.

  12. XTZGZoReX says:

    C# would be just fine.

    You’d be needing the low-level file I/O for loading the file, right? These are basic APIs supported by Mono -long- ago. You’d be needing a GUI, too. Mono has excellent support for WinForms already. Speed? Not as fast as C++, no doubt – everything comes at a cost. Personally, I find C# better because the framework/APIs you can expose to others will be much better organized and much easier to customize than in a C++ project (reflection being a reason). Not to mention, C# debugging is way easier than C++.

  13. Longpoke says:

    Yes, C# is fine too, there are modern 3d games written in it, which is proof that C/C++ are dead. But I’m not sure how safe the licensing on C# is… I hear a lot of people complaining about it because of that. Java, Ruby, Python, etc are free and open source, yes there is Mono for C#, and yes, obviously it has file I/O, I’ve never seen a general purpose programming language that has no way to do file I/O… and I wouldn’t call any form of file I/O that you will need in a disassembler low level, all you need to do is read the contents to a buffer.. The other thing about C# is I don’t know how easy it is to make portable code in it, which kind of defeats the purpose of using a managed language; keep in mind that we are reverse engineers here, lots of us don’t use Windows, and it’s just naive to write code that’s not portable when it’s not even doing anything low level.

  14. Longpoke says:

    P.S: If anyone mentions Perl or BASH, I will punch them in the face. Actually, I just remembered that metasm already exists, so I guess I’ll be using that on my next work.

  15. Saifi Khan says:

    Hi Arkon:

    After reading your very long blog post, it still does not explain why you chose GPL, since GPL does not address the concerns you’re having in the first place ie. community !

    Community forms when a group of individuals are interested in working and sharing knowledge on some topic of interest to them. This usually happens by trying to reach out to people, blogs, events and network together.

    GPL conditions “apply” only if a modification or distribution is made !

    An individual or company can “use” your library without modification or “modify” and use it internally. How does that help ‘community’ formation ?

    Please explain.

    Saifi Khan.

  16. arkon says:

    Hi Saifi,

    the way I see it –
    companies use diStorm for their closed source products which they also sell. They change diStorm and I can’t enjoy their modifications. Here and there people report bugs, that’s great. But I want more than that. By limiting the license, they will either have to pay to use diStorm, or use it as in open source – that will give more to the community if people add functionality to diStorm and publish, for instance.

  17. Saifi Khan says:

    Arkon, Shalom !

    You write “by limiting the license, they will have to pay to use diStorm”.

    Pay ???
    Well, GPL license ensures that the effective price of your software is zero ! Please review the GPL license terms once. GPL is for ‘free software’ and not open source.

    That brings back the question of community, which you haven’t answered. Your initial gripe in the blog post was ‘you are doing this alone and want to create a community’. Please explain.

    Consider this, a user needs to modify your code to fix a bug and by the terms of the license, (s)he’s supposed to make the work and the patches public. The user doesn’t want to spend his time, so (s)he files a bug report. You, the ‘original’ developer, are happy to see a bug report and work on the fix, essentially modify the code and make a new release. The user just downloads the updated release. The user got the benefit of the update without making the modifications him/her(self).

    What did your GPL protect in this case ? Please explain.


  18. anonymous says:

    The reason why it’s hard to build a community upon your single-disassembler is because your disassembler only solves one problem. A community built around a solution to one problem has no real room to grow. The only thing you can do in a situation like this is to orphan the project, and allow someone who wants to maintain it to take over. Good luck.

  19. arkon says:

    Finally someone brave enough to say some truth. That being said, it is not the whole issue. You forget that it’s a symptom of the problem here. It might be that in a few years I will get to work on the Studio thingy, and then we will see how things really are.

    I’m closing this post for comments.