SmartPointer In C++

March 3rd, 2010

Smart pointers, the way I see it, are there to help you with two things: saving memory and automatic destruction. There are plenty of kinds of smart pointers and only one type of dumb pointer ;) I am going to talk about the one that keeps a reference count to the data. To me they are among the most important and useful classes I have used in my code. The AutoResource class I posted about, here, is another type of smart pointer. I fell in love with smart pointers as soon as I learnt about them a long time ago. However, I happened to write an implementation of this concept only once, in some real product code. Most of the time I got to use libraries that supply them, like ATL and stuff. Of course, when we write code in high level languages like Python, C#, Java, etc., we are not even aware of their internal use, mostly anyway.

This topic is not new or anything, it is covered widely on the net, but I felt the need to share a small code snippet with my own implementation, which I wrote from scratch. It seems that in order to write this class you don’t need high skills in C++, not at all. Though if you wanna get dirty with some edge cases, like the ones described in ‘More Effective C++’, you need to know the language pretty well.

As I said earlier, the smart pointer concept I’m talking about here is the one that keeps the number of references to the real instance of the object, and when all references are gone, it simply deletes the only real instance. Another requirement from this class is to behave like a dumb pointer (that’s just the normal pointer the language supplies). My implementation is not as complete as the dumb pointer in terms of the operators and operations you can apply to the pointer, but I think for most code usages it will be just enough. It can always be extended, and besides, if you really need a crazy ultra generic smart pointer, Boost is waiting for you.

In order to keep a reference count for the instance, we need to allocate that variable, and also the instance itself, and to make sure they won’t go anywhere as long as somebody else still points to them. The catch is that if the count were a member of the SmartPointer class itself, it would die when the SmartPointer instance goes out of scope. Therefore it has to live in another object, reached through a pointer, which holds the number of references and the real instance. Then a few smart pointers can point to this core object that holds the real stuff. I think this was the only challenge in understanding how it works. The rest is a few more lines to add functionality: getting the pointer, the copy constructor, the assignment operator and stuff.

Of course, it requires a template class, I didn’t even mention that once, because I think it’s obvious.
Here are the classes:

template <class T> class SmartPtr {
public:
  SmartPtr(T o)
  {
    // Notice we create a DataObj that gets an object of type T.
    m_Obj = new DataObj(o);
  }
  // … A few additional small methods are absent from this snippet, check the link below
private:
  // Now, here we define an internal class, which holds the reference count and the real object’s instance.
  class DataObj {
  public:
    DataObj(T o) : m_ReferenceCount(0)
    {
      m_Ptr = new T(o); // First allocate, this time the real deal
      AddRef(); // And only then add the first reference count
    }
    unsigned int AddRef()
    {  return m_ReferenceCount++;  }
    void Release()
    {
      if (--m_ReferenceCount == 0) {
        delete m_Ptr; // Delete the instance
        delete this; // Delete the DataObj instance too
      }
    }
    T* m_Ptr; // Pointer to a single instance of T
    unsigned int m_ReferenceCount; // Number of references to the instance
  };

  // This is now part of the SmartPointer class itself, you see? It points to the DataObj and not to T!
  DataObj* m_Obj;
};
 

To see the full source code, get SmartPointer.txt.

I didn’t show it in the snippet above, but the assignment operator and the copy constructor, which get another smart pointer on the right-hand side, simply copy the m_Obj pointer from it and add a reference to it. And by that, the ‘magic’ is done.
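Here is a minimal sketch of what that copy logic might look like, assuming the same DataObj layout as in the snippet above. It is not the code from SmartPointer.txt; the destructor, the two access operators and the UseCount helper are my assumptions, added only so the behavior can be observed.

```cpp
#include <cassert>

// Sketch only: copy, assign and destroy on top of a shared DataObj.
template <class T> class SmartPtr {
public:
  explicit SmartPtr(const T& o) : m_Obj(new DataObj(o)) {}

  // Copy constructor: share the other pointer's DataObj and add a reference.
  SmartPtr(const SmartPtr& other) : m_Obj(other.m_Obj) { m_Obj->AddRef(); }

  // Assignment: add the new reference first and release the old one second,
  // so that assigning a pointer to itself stays safe.
  SmartPtr& operator=(const SmartPtr& other)
  {
    other.m_Obj->AddRef();
    m_Obj->Release();
    m_Obj = other.m_Obj;
    return *this;
  }

  ~SmartPtr() { m_Obj->Release(); }

  T* operator->() const { return m_Obj->m_Ptr; }
  T& operator*() const { return *m_Obj->m_Ptr; }

  // Hypothetical helper for illustration: current number of references.
  unsigned int UseCount() const { return m_Obj->m_ReferenceCount; }

private:
  struct DataObj {
    explicit DataObj(const T& o) : m_Ptr(new T(o)), m_ReferenceCount(1) {}
    void AddRef() { ++m_ReferenceCount; }
    void Release()
    {
      assert(m_ReferenceCount > 0);
      if (--m_ReferenceCount == 0) {
        delete m_Ptr; // Delete the instance
        delete this;  // Delete the DataObj instance too
      }
    }
    T* m_Ptr;                      // Pointer to a single instance of T
    unsigned int m_ReferenceCount; // Number of references to the instance
  };

  DataObj* m_Obj;
};
```

Copying a SmartPtr never copies the T instance; both pointers see the same object, and it is deleted only when the last of them goes away.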

To support multi-threaded access to the class, you simply need to change the AddRef method to use InterlockedAdd, and to change Release to use InterlockedSub; ahh, of course, that means InterlockedAdd with -1.
And then the reference counting would be thread safe. Also note that in Release you need to use the value returned by InterlockedAdd, rather than reading the counter again and comparing it after the call. This is a common bug when writing multi-threaded code. Note that if the type of the object you create using the SmartPointer doesn’t support multi-threading in the first place, nothing you can do in the smart pointer methods themselves is going to solve it, of course.
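A small sketch of that idea follows. It uses std::atomic as a portable stand-in for the Interlocked* calls, which is my substitution and not what the original code does; the RefCount class and its method names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Sketch of a thread-safe reference counter.
class RefCount {
public:
  RefCount() : m_Count(1) {}

  // Returns the new count.
  unsigned int AddRef() { return m_Count.fetch_add(1) + 1; }

  // Returns true when the caller dropped the last reference. The decision is
  // based on the value returned by the atomic operation itself, never on a
  // second read of m_Count, which another thread could have changed meanwhile.
  bool Release()
  {
    unsigned int previous = m_Count.fetch_sub(1);
    assert(previous > 0); // Releasing more often than acquiring is a bug
    return previous == 1;
  }

  unsigned int Get() const { return m_Count.load(); }

private:
  std::atomic<unsigned int> m_Count;
};
```

Only the thread that sees Release return true should delete the shared data.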

I didn’t show it in the snippet either, but the code supports comparing a SmartPointer variable to NULL. Though you won’t be able to write something like:
if (!MySmartPtr) fail… The compiler will shout at you that operator ! is not supported. It takes exactly 3 lines to add it.
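Those few lines could look something like the sketch below. PtrWrapper is a hypothetical, stripped-down stand-in for the SmartPointer class (no reference counting), used only to show the operator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in that just wraps a raw pointer.
template <class T> class PtrWrapper {
public:
  explicit PtrWrapper(T* p = NULL) : m_Ptr(p) {}

  // The operator in question: true when no object is held.
  bool operator!() const
  {
    return m_Ptr == NULL;
  }

  T& operator*() const
  {
    assert(m_Ptr != NULL);
    return *m_Ptr;
  }

private:
  T* m_Ptr;
};
```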

The only problem with this implementation is that you can write back to the data directly after getting the pointer to it. For me this is not a problem ’cause I never do that. But if you feel it’s not good enough for you, for some reason, check out other implementations or just read the book I mentioned earlier.

Overall it’s really a small class that gives a lot. Joy

Undocumented Kernel API Again…

February 24th, 2010

The function I’m going to talk about is nothing new. The annoying thing is that you can’t find it in the WDK. Sometimes you want to know the name of the calling process (suppose its image name is enough). But it can’t be used for security, because you can create a ‘logon.exe’ and run it from the desktop directory, and it will be seen as ‘logon.exe’. Therefore it’s mostly useful for debugging or something.

So once you get a PEPROCESS and you wish to get its image name, you can call PsGetProcessImageFileName. We all know those hacks that scan the current PEPROCESS for ’system’ when the DriverEntry is being called and store the offset for later use. But it’s not really needed anymore.

extern "C" {

extern char* PsGetProcessImageFileName(PEPROCESS p);

}

DbgPrint("Calling process name is: %s\n", PsGetProcessImageFileName(PsGetCurrentProcess()));
 

Retrieving the full path name of a process from kernel can be a b1tch. And I don’t know a good way to do it. Though I think the best way would be to get the ControlArea of the mapped image of that process, but IIRC it needs a KeAttachProcess which sucks… There are many forums which talk about it anyway…

Trying to Pwn Stuff my way

January 30th, 2010

I have been playing CS since 2001 :) Kinda addicted, I can say. Like, after I had been in South America for half a year, I suddenly caught myself thinking “ohhh I wish I could play CS”… So I think it means I’m addicted. Anyway, I really like that game. A few days ago I was playing on some server and suddenly hl2 crashed. How good it is that they generate a crash dump automatically, so I fired up WinDbg and took a look at what happened. I found out that some pointer was set to 1, not NULL, mind you. Looking around the crash area I found a buffer overflow on the stack, but only for booleans, so I don’t know what the point was, how it was triggered or who sent it (the server or another player). Anyway, as much as I like this game, there is one thing I don’t like about it: the stupid children you play with/against, they curse and TK (team-kill) like noobs. One day I promised myself that I would pwn those little bastards. Therefore I started to investigate this crash area. I won’t say anything about the technical details here, so you won’t be able to replicate it, except that I found a stack buffer overflow. The way from there to pwning the clients who connect to a server I set up is really easy. The down side is that they have to connect to a server I control, which is quite lame; the point is to pwn other players on a remote server, so I am still working on that. For me pwning would be to find a way to kick them from the server, for instance; I don’t need to execute code on their machines. Besides, since I do everything for fun, and I’m not a criminal, I have to mention that it’s for educational purposes only :) Being the good guy I am, in ZERT and stuff.
I just wanted to add that the protocol used to be really hole-y before CS: Source came out. Everything was vulnerable, really: you could tell the server that you wanted to upload a file to it (your spray-decal file) with a name longer than 256 characters, and bam, you own the server through a stupid strcpy to a buffer on the stack. But after CSS came out, the guys did a great job and I could hardly find stuff. What I found is in some esoteric parser whose input comes from the server… What was weird is that some functions were protected with a security cookie and some weren’t. I don’t know what configuration those guys use to compile the game, but they surely need to work it out better.

Another thing I’ve been trying to pwn for a long time now, without much success, I have to say, is NTVDM. This piece of software is huge; though most of it is in user-mode, there is lots of related code in the kernel. Recently a very crazy bug was found there (which can lead to a privilege escalation), something in the design of how the kernel transfers control to BIOS code and returns. You can read more here to get a better clue. So it gave me some idea what to do about some potentially buggy code I found. Suppose I found code in the kernel that takes DS:SI and changes it to a flat pointer; the calculation is (DS << 4) + SI. The thing is that DS is 16 bits only. What I thought is that with some wizardry I would be able to change DS to have some value above 0xffff. For some of you it might sound impossible, but in 32 bits, playing with pop ds, mov ds, ax and the like, I managed to put random values in the high 16 bits of DS (say it’s a 32 bit segment register). Though I don’t know if WinDbg showed me garbage or how it really worked, or what happened there, I surely saw big values in DS. Since I couldn’t reproduce this behavior in 16 bits under NTVDM, I tried to think of a way to set DS in the VDM Context itself. If you look at the exports of NTVDM you will see a function named “SetDS”, so after taking a look at how it works I tried to use it from my 16 bits code (exploiting some Escape bug I found myself and posted on this blog earlier), and I could set DS to whatever arbitrary value I wanted. Mind you, I set DS for the VM itself, not the DS of the usermode application ntvdm.exe. And then I tried to trigger the other part in the kernel, which takes my raw pointer and tries to write to it, but the high 16 bits of DS were zeros. Damn it. Then I gave it more thought, and understood that what I did is not good enough.
This is because once I set DS to some value, the code then gets to execute on the processor for real, and when it enters the kernel’s trap handler, the high half of DS gets truncated once again and I lose the game. So I’m still thinking whether it’s possible. Maybe the next step I should try is to invoke the kernel’s trap handler directly with DS set to whatever value I want, but that’s probably not possible since I can’t control the trap frame myself… or maybe I can ;)
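Just to make the arithmetic concrete, here is the (DS << 4) + SI calculation as a tiny sketch. The function name is hypothetical, and it only shows the normal case where DS really is 16 bits wide.

```cpp
#include <cstdint>

// Flat (linear) address from a real-mode segment:offset pair.
inline std::uint32_t FlatAddress(std::uint16_t segment, std::uint16_t offset)
{
  return (static_cast<std::uint32_t>(segment) << 4) + offset;
}
```

With both inputs limited to 16 bits, the largest result is FlatAddress(0xFFFF, 0xFFFF), which is 0x10FFEF, so a normal segment register can never reach far beyond the first megabyte.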

diStorm3 Released

January 14th, 2010

Hey people
I just uploaded the source code of diStorm3 to Google Code.
Now you can enjoy SVN too! And of course, better source code and features from the disassembler itself.
It is officially released under the GPLv3.

Special thanks to Michael Rolle, for tons of suggestions, ideas and fixes. Also thanks to many others who reported buggy instructions.

In the next few weeks I will update the source code some more, for more compilers, etc. I’m going to re-do the webpage of diStorm, and hopefully it will have some nice logo too. And the most important thing, I will make somewhat of a tutorial on how to use the new disassembler with the newest features. Stay tuned!

Gil

Thoughts About Open Source Community In RE

January 4th, 2010

Hello there,
if you wanna know me better and you use diStorm, take 10 minutes from your precious time and please read my post.

If you don’t give a damn about diStorm or open source community, you will waste your time reading this post, so visit here.

Believe me or not, I would have given this code (new version) under BSD. I really wanted to. A proof is that diStorm64 was BSD so far, and still is, however it is deprecated. But I don’t see a good end for BSD. Let me explain. It’s very permissive, everybody can use it and I like it more than GPL for that reason, truly. Why, after all, did I decide on releasing diStorm under a GPL license, and not even LGPL? That’s a good question and I will answer it by telling you my story and more.

I began this project for pure fun and the challenge of decoding x86, back in mid 2003 already. Soon enough I got some basic framework that could accept a stream, look in some data structure and fetch an instruction, then decode the operands. Sounds easy, right? Because it is easy. But that was only for integer instructions. After a few months I added FPU support. This is where I started to hate the x86 machine code, really. It’s a pain in the arse. Only at that time did I start to look around and see what other (disassembler) libraries we got on the Internet. It was already 2004, and I still enjoyed coding diStorm for the fun and sport of it, 100% from scratch, always. I knew from day one that I was going to make it an open source project. And yet, it wasn’t so useful, there were better disassemblers out there. I’m talking about binary stream disassemblers, mind you, not the GUI wrapped ones with high level features. Anyway, diStorm wasn’t anything special yet. Therefore I had to work hard, in my free time, no complaints ever, and I added all those SSE instruction sets. Like 5 sets eventually were added to diStorm, in addition to new sets every now and then in other computing fields. Honestly, I doubt people use diStorm for SSE, but you never know. Besides, the goal of diStorm was to be a complete product, top quality and optimized, and I achieved them all in time. diStorm was open sourced in the beginning of 2006. A few months later I added AMD64 support, and then diStorm was the first open source disassembler library to support it.

All the while I got lots of emails about diStorm. Some were asking for help with how to use it, some were about defects in specific instructions, etc. And even two critical bugs: one was a code regression where I accidentally put a bug in the code, and the other was a memory leak in the Python module, which I happened to have fixed already.

The most appreciated work from the community was about the sample projects. People helped me with the code to make it more useful, and better code for each platform (there are Win32 and Linux separated projects). But never anything about diStorm’s code itself. Maybe the project is too complex. Maybe there was no need, overall it was stable and mature. I don’t know.

Since 2005 I got more than 50,000 downloads of the variants of diStorm (the sample projects, the full source code, the library, the Python modules, etc). It is a lot, like 10K a year. Don’t forget that after all it’s a mere disassembler, not some crazy application, and it’s even a library, so only developers can use it. Though later I added the flat disassembler project, compiled, which can be downloaded in binary form. And what do we learn from this? Nothing, nada! You can never trust statistics like this, and they don’t mean much, ’cause there are new releases and the same person can download the project a few times over time, to get the latest version, etc. So I can’t have any info about how many people/companies use it around the world.

My own goal was to make diStorm the best disassembler out there. Only you can judge. I know what it’s worth.

Sometimes I had the doubts about this issue. And it gave me the inspiration to go on and bring the next generation disassembler library to the world, one that afterwards I can retire, in a way, and finally start to enjoy the fruits of my hard work with coding code analysis tools myself in the future. About time, yey, no more excuses, no more stupid string parsing, totally efficient.

I see many people complain about the general status of tools in the Reverse Engineering field. Not many people open their code. And the rest make money out of it. And I want to make a better community in this field. I really tried to do so, but without success. People that use diStorm either only use it as is, and they won’t open their code because they want to keep their precious knowledge to themselves, or sell it for money. That’s legitimate. You work, you get money. And it also gives the opportunity for tools to exist, otherwise they wouldn’t be there at all. But I think we can do better. I am doing better myself this way, I believe so, and therefore I open the code under GPL. GPL is ugly and the only reason I am for it is because I want the community to help me with this project and the coming one. Ohhhh, you will love the next project. It’s not a library anymore but a whole studio… dreams come true. I make them come true on my end, but I need help. For crying out loud.

I, hereby, am asking the community, the people out there that do this stuff for fun or profit, to help, to contribute back to the community. I can’t do it all myself, it will take years. Though I am willing to do it myself, and that’s what I do. But then WHAT FOR?! So some companies can enjoy my work and make money at my expense? NO THANKS.

Community, Community, Community, this is key, what the whole issue is about. You are saying you need open source tools, no problem, but share. Let’s do it together. Parsing strings is not the way. Finally we got a good weapon and let’s build a framework on top of it. We got Olly, we got WinDbg, and we got IDA. None of them is opened source. Each has a clear win in its niche. I think there is a room, a NEED, for something new, free and open source. If I am not going to get help this time, I am not going to open source the next project, because it’s pointless and by now you should know exactly why.

You know what, maybe it’s all my fault. Maybe diStorm’s code is TOO complex. (What do you expect though?). Maybe the stupid and ugly diStorm’s page is not easy to track. Otherwise why I see someone who spent a few hours taking a disassembler out of a bigger project and make it work for Windows kernel, while you can do that in diStorm in two clicks?

Maybe nobody gives a f@ck about it, probably, just a stupid disassembler. But no more. This is the time to make a change, a big one. It might be a disassembler, or anything else, but it just shows the attitude of the community. And don’t kid yourself, everybody looks for code analysis stuff, or eventually writes it on their own. C’est tout. Mailing list or not, if we are not going to help each other and only complain that there are not enough open source projects in the RE field, nothing good is gonna happen.

GOOD LUCK!

Gil Dabah, Arkon

P.S – You can start by forwarding a link to this post.

RSS Feeder

January 2nd, 2010

Hey guys,
apparently RSS was broken for this blog, since I redirected it to feedburner. But now everything seems to work once again. I appreciate that anonymous fellow’s comment ;)
Gil

diStorm3 – News

December 29th, 2009

Yo yo yo… forgot to say happy xmas last time, never too late, ah? :)

This time I wanted to update you about diStorm3 once again. Yesterday I had a good coding session and I added some of the new features regarding flow control. The decode function gets a new parameter called ‘features’, a bit field that lets you ask the disassembler to do some new stuff such as:

  1. Stop on INT instructions [INT, INT1, INT3, INTO]
  2. Stop on CALL instructions [CALL, CALL FAR]
  3. Stop on RET instructions [RET, RETF, IRET]
  4. Stop on JMP instructions [JMP, JMP FAR]
  5. Stop on any conditional branch instructions [JXXX, JXCX, LOOPXX]
  6. Stop on any flow control (all of the above)

I wasn’t sure about SYSCALL and the like and UD2, so for now I left them out. So what we get now is the ability to instruct the disassembler to stop decoding after it encounters one of the above conditions. This makes the higher disassembler layer more efficient, because now you can disassemble code by basic blocks. It also makes building a call-graph or branches-graph faster.
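To illustrate how such a bit field could be tested, here is a rough sketch. All the names and values in it (STOP_ON_*, FlowType, ShouldStop) are hypothetical and made up for this example; they are not the real diStorm3 interface.

```cpp
#include <cstdint>

// Hypothetical flag names and values, for illustration only.
const std::uint32_t STOP_ON_INT         = 1u << 0;
const std::uint32_t STOP_ON_CALL        = 1u << 1;
const std::uint32_t STOP_ON_RET         = 1u << 2;
const std::uint32_t STOP_ON_JMP         = 1u << 3;
const std::uint32_t STOP_ON_COND_BRANCH = 1u << 4;
// "Any flow control" is simply all of the above combined.
const std::uint32_t STOP_ON_FLOW_CONTROL =
    STOP_ON_INT | STOP_ON_CALL | STOP_ON_RET | STOP_ON_JMP | STOP_ON_COND_BRANCH;

// Flow-control class of a decoded instruction (also hypothetical).
enum FlowType { FT_NONE, FT_INT, FT_CALL, FT_RET, FT_JMP, FT_COND_BRANCH };

// Returns true if decoding should stop after an instruction of this class.
inline bool ShouldStop(std::uint32_t features, FlowType type)
{
  switch (type) {
    case FT_INT:         return (features & STOP_ON_INT) != 0;
    case FT_CALL:        return (features & STOP_ON_CALL) != 0;
    case FT_RET:         return (features & STOP_ON_RET) != 0;
    case FT_JMP:         return (features & STOP_ON_JMP) != 0;
    case FT_COND_BRANCH: return (features & STOP_ON_COND_BRANCH) != 0;
    case FT_NONE:        break;
  }
  return false;
}
```

A caller that wants to split code into basic blocks would pass STOP_ON_FLOW_CONTROL and start a new block every time the decoder stops.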

Note that now you will be able to ask the disassembler to return a single instruction. I know it sounds stupid, but I talked about it already, and I had some reasons to avoid this behavior. Anyway, now you’re free to ask how many instructions you want, as long as the disassembler can read them from the stream you supply.

Another feature added is the ability to filter non-flow-control instructions. Suppose you are interested in building a call-graph only, there’s no reason that you will get all the data-control instructions, because they are probably useless for the case. Mixing this flag with ‘Stop on RET’ and ‘Stop on CALL’, you can do nice stuff.

Another thing is that I separated the memory-indirection description of an operand into two forms. First of all, a memory indirection operand is when an instruction reads/writes from/to memory. Usually in Assembly text, you will see bracket characters surrounding some expression. Something like: MOV [EDX], EAX, which means we write a DWORD to the address EDX points to. If you followed me ’till here, you should know exactly what I’m talking about anyway.

When you get the result of such an instruction from diStorm3, the type of the operand will be SMEM (stands for simple-memory), which hints there’s only one register in the memory-indirection operand. It doesn’t hint anything about the displacement, though, that’s the offset you usually see in the brackets, like MOV [EDX+0x12345678], EAX. So you will have to test whether the displacement exists in both forms. The other form is MEM (normal memory indirection, or probably it should be called ‘complex’), since it supports the full memory indirection operand, like: MOV [EAX*4 + ESI + 0x12345678], EAX. Then you will have to read another register that supplies the base register, in addition to the index register and scale. Note that this applies to 16 bits mode addressing as well, where you can have a mix of ‘[BX+SI]’ or only ‘[BX]’. Also note that sometimes in 32/64 bits mode you can have a SIB byte that sets only the base register while the index register is unused, but diStorm3 will return it as an SMEM, to simplify matters. This way it’s really easy to read the instruction’s parameters.
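As a sketch of how a caller might consume the two forms, assume an operand structure like the one below. The structure, its field names and the register numbering are all hypothetical and do not reflect the actual diStorm3 output; the sketch also assumes a MEM operand always has a base register.

```cpp
#include <cstdint>

enum OperandType { O_SMEM, O_MEM }; // hypothetical names

struct MemOperand {
  OperandType type;
  int base;            // Register number; the only register for SMEM
  int index;           // Index register number, used by MEM only
  std::uint32_t scale; // 1, 2, 4 or 8, used by MEM only
  bool hasDisp;        // The displacement has to be tested in both forms
  std::uint64_t disp;
};

// Computes the address the operand refers to, given the register values.
inline std::uint64_t EffectiveAddress(const MemOperand& op, const std::uint64_t* regs)
{
  std::uint64_t address = regs[op.base];
  if (op.type == O_MEM) address += regs[op.index] * op.scale;
  if (op.hasDisp) address += op.disp;
  return address;
}
```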

Another feature for text formatting is the ability to tell the disassembler to limit the address to 16 or 32 bits. This is good since the offsets are 64 bits today. And if you encounter an instruction that jumps backwards, you will get a huge negative value, which won’t make much sense if you disassemble 16 bits code…
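The effect of limiting the address can be sketched like this. LimitAddress is a hypothetical helper written for this example, not a diStorm3 function.

```cpp
#include <cstdint>

// Keeps only the low `bits` bits of an address; 64 or more leaves it as is.
inline std::uint64_t LimitAddress(std::uint64_t address, unsigned bits)
{
  if (bits >= 64) return address;
  return address & ((std::uint64_t(1) << bits) - 1);
}
```

For example, a jump from offset 0x100 going 0x110 bytes backwards gives 0xFFFFFFFFFFFFFFF0 as a 64 bits value, while limiting it to 16 bits gives 0xFFF0, which is what you would expect to see in 16 bits code.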

diStorm3 still supplies the bad old interface. And now it supports two new additional functions: the decompose function, which returns the structures for each instruction read, and another function that formats a given structure into text, which is pretty straightforward. The text format doesn’t exactly match the output of diStorm64, it’s more simplified, but good enough. Besides, I have never heard any special comments about the formatting of diStorm64, so I guess it doesn’t matter much to you guys. And maybe, just maybe, I will add AT&T syntax later on.

Another field that is returned now, unlike diStorm64, is the instruction-set-class type of the instruction, with very broad categories, like Integer instructions, FPU instructions, SSE instructions, and so on. Still might be handy. And the hint about the flow-control type of the instruction.

Also I changed tons of code, and I really mean it, the skeleton is still the same, but the prefixes engine works totally different now. Trying to imitate a real processor this time. By including the last prefix found of that prefix-type. You can read more about this, here. I made the code way more optimized and eliminated double code and it’s still readable, if not for the better. Also I changed the way instruction are fetched, so the locate-instruction function is much smaller and better.

I’m pretty satisfied with the new version of diStorm and hopefully I will be able to share it with you guys soon. Still I got tons of tests to do, maybe I will add that unit-test module in Python to the project so you can enjoy it too, not sure yet.

Also I got a word from Mario Vilas, that he is going to help with compiling diStorm for different platforms, and I’m going to integrate his new Python wrappers that use ctypes, so you don’t need the Python extension anymore. Thanks Mario! ;) However, diStorm3 has its own Python module for the new structure output.

If you have more ideas, comments, complaints or you just hate me, this is the time to say so.
Cheers, happy new year soon!
Gil

Terminate Process @ 1 byte

December 28th, 2009

In the DOS days there was this lovely trick that you could just ‘ret’ anytime you wanted and the process would be terminated, I’m talking about .com files. It’s true that there were no real processes back then, more like a big chaotic jungle. Anyways, how did it work? Apparently the RET instruction popped a zero word from the stack and jumped to address 0. At address 0 there was this PSP (Program Segment Prefix), which had lots of interesting stuff, like the command line buffer and the like. The first word in this PSP (at address 0 of the whole segment too) was 0x20CD (the instruction INT 0x20), or the ‘Terminate Process’ interrupt. So branching to address 0 would run this INT 0x20 and close the program. Of course, you could execute INT 0x20 on your own, but then it would cost another byte :) But as long as the stack was balanced, it seemed you could totally rely on popping a zero word from the stack for this usage. At other times I used this value to zero a register, by simply doing pop ax, for instance…

Time passed, and now we all use Windows (almost all of us anyway) and I came up with a similar trick when I wrote Tiny PE. What I did was to put a last byte in my code with the value of 0xBC. Now usually there are some extra bytes following your code, either because you have data there or because of page alignment when the code section was allocated by the PE loader. This byte really means “MOV ESP, ???”. Though we don’t know what comes next to fill ESP with (some DWORD value), probably some junk, which is great.
At the cost of 1 byte, we caused ESP to get some uncontrolled value, which probably doesn’t map to anywhere valid. After this MOV ESP instruction is executed, nothing happens, alas; the next instruction gets executed too and so on. When does the process get to terminate, you ask? That’s the point: we have to wait for one of two conditions to occur. Either some junk instruction gets executed and causes an access violation because it touched some random address which is unmapped, or we run out of bytes to execute and hit an unmapped page, but this time because of the EIP pointer itself.
This is where the SEH mechanism comes in: the system sees there’s an AV exception and tries to call the structured exception handler because an exception has just been raised. Now since ESP is really junk, the process is already in an unrecoverable state, so the system terminates it. 1 byte FTW.

Short Update

December 27th, 2009

I’ve slowed down with posting here, been working hard recently both on diStorm and real life job. Hopefully I will be able to release diStorm3 soon (matter of weeks for a start). I’m almost finished with coding everything I wanted, though I’m still left with the features you guys asked for and tons of tests since I also changed lots of code and added new AVX instruction set.

Opening a file by ID – FILE_OPEN_BY_FILE_ID

December 25th, 2009

Sample code to open a file by its file-id. Had to use it for some tests and thought it might be useful for other people out there.

#include <windows.h>
#include <stdio.h>

typedef ULONG (__stdcall *pNtCreateFile)(
  PHANDLE FileHandle,
  ULONG DesiredAccess,
  PVOID ObjectAttributes,
  PVOID IoStatusBlock,
  PLARGE_INTEGER AllocationSize,
  ULONG FileAttributes,
  ULONG ShareAccess,
  ULONG CreateDisposition,
  ULONG CreateOptions,
  PVOID EaBuffer,
  ULONG EaLength
);

typedef ULONG (__stdcall *pNtReadFile)(
        IN HANDLE  FileHandle,
        IN HANDLE  Event  OPTIONAL,
        IN PVOID  ApcRoutine  OPTIONAL,
        IN PVOID  ApcContext  OPTIONAL,
        OUT PVOID  IoStatusBlock,
        OUT PVOID  Buffer,
        IN ULONG  Length,
        IN PLARGE_INTEGER  ByteOffset  OPTIONAL,
        IN PULONG  Key  OPTIONAL    );

typedef struct _UNICODE_STRING {
        USHORT Length, MaximumLength;
        PWCH Buffer;
} UNICODE_STRING, *PUNICODE_STRING;

typedef struct _OBJECT_ATTRIBUTES {
    ULONG Length;
    HANDLE RootDirectory;
    PUNICODE_STRING ObjectName;
    ULONG Attributes;
    PVOID SecurityDescriptor;        // Points to type SECURITY_DESCRIPTOR
    PVOID SecurityQualityOfService;  // Points to type SECURITY_QUALITY_OF_SERVICE
} OBJECT_ATTRIBUTES;

#define InitializeObjectAttributes( p, n, a, r, s ) { \
    (p)->Length = sizeof( OBJECT_ATTRIBUTES );          \
    (p)->RootDirectory = r;                             \
    (p)->Attributes = a;                                \
    (p)->ObjectName = n;                                \
    (p)->SecurityDescriptor = s;                        \
    (p)->SecurityQualityOfService = NULL;               \
    }

#define OBJ_CASE_INSENSITIVE                    0x00000040L
#define FILE_NON_DIRECTORY_FILE                 0x00000040
#define FILE_OPEN_BY_FILE_ID                    0x00002000
#define FILE_OPEN                               0x00000001

int main(int argc, char* argv[])
{
        HANDLE d = CreateFile(L"\\\\.\\c:", GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, 0, OPEN_EXISTING, 0, 0  );
        BY_HANDLE_FILE_INFORMATION i;
        HANDLE f = CreateFile(L"c:\\bla.bla", GENERIC_WRITE, 0, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        ULONG bla;
        WriteFile(f, "helloworld", 11, &bla, NULL);
        printf("%p, %lu\n", f, GetLastError());
        GetFileInformationByHandle(f, &i);
        printf("id:%08x-%08x\n", i.nFileIndexHigh, i.nFileIndexLow);
        CloseHandle(f);

        pNtCreateFile NtCreatefile = (pNtCreateFile)GetProcAddress(GetModuleHandle(L"ntdll.dll"), "NtCreateFile");
        pNtReadFile NtReadFile = (pNtReadFile)GetProcAddress(GetModuleHandle(L"ntdll.dll"), "NtReadFile");

        ULONG fid[2] = {i.nFileIndexLow, i.nFileIndexHigh};
        UNICODE_STRING fidstr = {8, 8, (PWSTR) fid};

        OBJECT_ATTRIBUTES oa = {0};
    InitializeObjectAttributes (&oa, &fidstr, OBJ_CASE_INSENSITIVE, d, NULL);

    ULONG_PTR iosb[2]; // Stands in for IO_STATUS_BLOCK, which holds two pointer-sized fields
    ULONG status = NtCreatefile(&f, GENERIC_ALL, &oa, iosb, NULL, FILE_ATTRIBUTE_NORMAL, FILE_SHARE_READ | FILE_SHARE_WRITE, FILE_OPEN, FILE_OPEN_BY_FILE_ID | FILE_NON_DIRECTORY_FILE, NULL, 0);
        printf("status: %X, handle: %p\n", status, f);
        UCHAR buf[11] = {0};
        LONG Off[2] = {0};
        status = NtReadFile(f, NULL, NULL, NULL, (PVOID)&iosb, (PVOID)buf, sizeof(buf), (PLARGE_INTEGER)&Off, NULL);
        printf("status: %X, bytes: %d\n", status, (int)iosb[1]);
        printf("buf: %s\n", buf);
        CloseHandle(f);
        CloseHandle(d);
}