One of the fun things to do with applications is to bypass their copy-protection mechanisms. So I want to share my experience about some iPad application, though the application is targeted for the Jailbroken devices. It all began a few days ago, when a friend was challenging me to crack some application. I had my motives, and I’m not going to talk about them. However, that’s why the title says non-profit. Or maybe when they always say “for profit” they mean the technical-knowledge profit.
So before you start to crack some application, what you should do is see how it works, what happens when you run it, what GUI related stuff you can see, like dialog boxes or messages that popup, upon some event you fire. There are so many techniques to approach application-cracking, but I’m not here to write a tutorial, just to talk a bit about what I did.
So I fired IDA with the app loaded, the app was quite small, around 35kb. First thing I was doing was to see the imported functions. This is how I know what I’m going to fight with in one glare. I saw MD5/RSA imported from the crypto library, and that was like “oh uh”, but no drama. Thing is, my friend purchased the app and gave me the license file. Obviously it’s easier with a license file, otherwise, sometimes it’s proved that it’s impossible to crack software without critical info that is encrypted in the license file, that was the issue in my case too. Of course, there’s no point in a license file that only checks the serial-number or something like that, because it’s not enough. So without the license file, there wasn’t much to do.
For some reason IDA didn’t like to parse the app well, so I had to recall how to use this ugly API of IDC (the internal scripting language of IDA), yes, I know IDA Python, but didn’t want to use it. So my script was fixing all LDR instructions, cause the code is PICy so with the strings revealed I could easily follow all those ugly objc_msgSend calls. For Apple’s credit, the messages are text based, so it’s easy to understand what’s going on, once you manage to get to that string. For performance’s sake, this is so lame, I rather use integers than strings, com’on.
Luckily the developer of that app didn’t bother to hide the exported list of functions, he was busy with pure protection algorithm in Objective-C, good for me.
So eventually the way the app worked (license perspective) was to check if the license file exists, if so, parse it. Otherwise, ask for a permission to connect to the Internet and send the UDID (unique device ID) of the device to the app’s server, get a response, and if the status code was success, write it to a file, then run the license validator again.
The license validator was quite cool, it was calling dladdr on itself to get the full path of the executable itself, then calculating the MD5 of the binary. Can you see why? So if you thought you could easily tamper with the file, you were wrong. Taking the MD5 hash, and xoring it in some pattern with the data from the license file; Then decrypting the result with the public key that was in the static segment, though I didn’t care much about it. Since the MD5 of the binary itself was used, this dependency is a very clever trick of the developer, though expected. So I tried to learn more about how the protection works.
Suppose the license was legit, the app would take that buffer and strtok() it to tokens, to check that the UDID was correct. The developer was nice enough to call the lockdownd APIs directly, so in one second I knew where and what was going on around it. In the beginning I wanted to create a proxy dylib for this lockdownd library, but it would require me to patch the header of the mach-o so the imported function will be through my new file – but it still requires a change to the file, no good. So the way it worked with the decrypted string – it kept on tokenizing the string, but this time, it checked for some string match, as if someone tampered with the binary, the decryption would go wrong and the string wouldn’t compare well. And then it did some manipulation on some object, adding methods to it in runtime, with the names from the tokenized string, thus if you don’t have a license file to begin with, you don’t know the names of the new methods that were added. One star for the developer, yipi.
All in all, I have to say that I wasn’t using any debugger or runtime tricks, everything was static reversing, yikes. Therefore, after I was convinced that I can’t ignore the protection because I lack of the names of the new methods, and I can’t use a debugger to phish the names easily. I was left with one solution, as I said before – faking the UDID and fixing the MD5.
What I really cared about for a start, was how the app calculates the MD5 of itself:
Since the developer retrieved the name of the binary using dladdr, I couldn’t just change some path to point to the original copy of the binary, so when it hashes it, it would get the expected hash. That was a bammer, I had to do something else, but similar idea… I decided to patch the file-open function. The library functions are called in ARM mode and it’s very clear. The app itself was in THUMB, so it transitions to ARM using a BX instruction and calls a thunk, that in order will call the imported function. So the thunk function is in ARM mode, thus 4 bytes per instruction, very wasteful IMHO.
The goal of my patches was to patch those thunks, rather than all the callers to those thunks. Cause I could end up with a dozen of different places to patch. So I was limited in the patches I could do in a way. So eventually I extended the thunk of the file-open and made R0 register point to my controlled path, where I could guarantee an original copy of the binary, so when it calculated the MD5 of it, it would be the expected hash. Again, I could do so many other things, like planting a new MD5 value in the binary and copy it in the MD5-Final API call, but that required too much code changes. And oh yes, I’m such a jackass that I didn’t even use an Arm-assembler. Pfft, hex-editing FTW :( Oh also, I have to comment that it was safe to patch the thunk of file-open, cause all the callers were related to the MD5 hashing…
Ok, so now I got the MD5 good and I could patch the file however I saw fit. Patching the UDID-strcmp’s wasn’t enough, since the license wasn’t a “yes/no” check, it had essential data I needed, otherwise I could finish with the protection in 1 minute patch (without going to the MD5 hassle). So I didn’t even touch those strcmp’s.
RSA encryption then? Ahhh not so fast, the developer was decrypting the xored license with the resulted MD5 hash, then comparing the UDID, so I got the license decrypted well with the MD5 patch, but now the UDID that was returned from the lockdownd was wrong, wrong because it wasn’t corresponding to the purchased license. So I had to change it as well. The problem with that UDID and the lockdownd API, is that it returns a CFSTR, so I had to wrap it with that annoying structure. That done, I patched the thunk of the lockdown API to simply return my CFSTR of the needed UDID string.
And guess what?? it crashed :) I put my extra code in a __ustring segment, in the beginning I thought the segment wasn’t executable, because it’s a data. But I tried to run something very basic that would work for sure, and it did, so I understood the problem was with my patch. So I had to double check it. Then I found out that I was piggy-backing on the wrong (existing) CFSTR, because I changed its type. Probably some code that was using the patched CFSTR was expecting a different type and therefore crashed, so I piggy-backed a different CFSTR that wouldn’t harm the application and was a similar type to what I needed (Just a string, 0x7c8). What don’t we do when we don’t have segment slacks for our patch code. :)
And then it worked… how surprising, NOT. But it required lots of trial and errors, mind you, because lack of tools mostly.
End of story.
It’s really hard to say how I would design it better, when I had my chance, I was crazy about obfuscation, to make the reverser desperate, so he can’t see a single API call, no strings, nothing. Plant decoy strings, code, functionality, so he wastes more time. Since it’s always possible to bypass the protections, if the CPU can do it, I can do it too, right? (as long as I’m on the same ring).