{"id":28,"date":"2007-10-08T17:28:13","date_gmt":"2007-10-08T21:28:13","guid":{"rendered":"http:\/\/www.ragestorm.net\/blogs\/?p=28"},"modified":"2007-10-08T17:28:13","modified_gmt":"2007-10-08T21:28:13","slug":"x86-assemblyyy","status":"publish","type":"post","link":"https:\/\/www.ragestorm.net\/blogs\/?p=28","title":{"rendered":"X86 Assemblyyy"},"content":{"rendered":"<p>Complex instructions are really useful, especially if you try to optimize the size of your code. Of course, modern processors are becoming more and more RISC&#8217;ish. But as for X86, its backward compatibility makes those instructions stay there (forever?), ready for you to use. The funny thing is that on modern X86 processors the RISC-like instructions are probably faster, so compilers don&#8217;t generate code with the CISC instructions. Thing is, when you are size-optimizing your code, or writing shellcode, you don&#8217;t care much about speed at all. So why not take advantage of those instructions?<\/p>\n<p>The most popular X86 CISC instruction is LOOP. It&#8217;s a simple one as well: it decrements the general purpose register CX(\/ECX\/RCX) by one and jumps to some address if the result is not zero. So you have something like 3 sub-instructions in one. Or call them micro-opcodes. Namely: a decrement, an if statement (cmp) and a branch.<\/p>\n<p>So speaking of LOOP, there are also LOOPZ and LOOPNZ. In addition to requiring that <em>r<\/em>CX is not zero, these instructions branch only if the Zero flag is set (LOOPZ) or clear (LOOPNZ). Which means that you &#8220;earn&#8221; another condition test for free. For instance, if you wanted to do some test on each entry in an array and then continue to the next entry only if the previous one was successful <em>and<\/em> there are still cells to scan, those instructions might be helpful.<\/p>\n<p>I have never seen anyone use those instructions, not even in code crunching. 
I think it&#8217;s because most people just don&#8217;t read the specs, and even if they do, they don&#8217;t know how to use those instructions. Not that they are hard to use, but maybe they are a bit confusing or just not popular.<\/p>\n<p>I found a somewhat useless combination of the repeat prefix with the LODS instruction. LODSB means: read into AL the byte at address DS:rSI and advance rSI (forward or backward, depending on DF). With a REP prefix this is repeated rCX times, so you end up with code that leaves in AL the last byte of the buffer that rSI was pointing to&#8230; (Of course it depends on the initial value of rCX.) I think that in the 8086 this REP and LODS combination was prohibited. So while I was working on diStorm, I made it ignore the REP prefix when it precedes a LODS instruction. Then I got an angry email saying that today this is not the case and the combo is supported&#8230; I even checked the current specs and it seems that the guy was right. So honestly, I&#8217;m not sure it&#8217;s useful for anything&#8230; but it&#8217;s cool to note it.<\/p>\n<p>Another instruction I wanted to talk about is SCAS. I guess you know this instruction from the strlen implementation, as follows:<\/p>\n<pre>sub ecx, ecx    ; ecx = 0\r\nsub al, al      ; al = 0, the terminator to search for\r\nnot ecx         ; ecx = 0xFFFFFFFF, the maximum count\r\ncld             ; scan forward (DF = 0)\r\nrepne scasb     ; scan the string at es:edi until a zero byte is found\r\nnot ecx         ; ecx = bytes scanned, including the terminator\r\ndec ecx         ; ecx = string length<\/pre>\n<p>Now, I&#8217;m not sure whether this is the fastest way to implement strlen; some compilers use this implementation and others use the find-a-zero-byte-inside-a-dword trick. Though maybe I should talk about those tricks in another post someday&#8230;<\/p>\n<p>Anyway, back to SCASB. Now that we saw how strlen is implemented, we know that the REPNE prefix means continue as long as rCX is not zero <em>and<\/em> as long as the Zero flag is clear, so we test two conditions in one instruction. In the code above the REPNE prefix tests ZF, but the truth is that the SCAS instruction <strong>updates<\/strong> all the <em>other<\/em> flags as well. 
So think of the SCAS instruction as a compare instruction between the Accumulator register (AL\/AX\/EAX\/RAX) and the memory at ES:rDI&#8230; For example, you can do SCAS and then JS (jump on sign)&#8230;<\/p>\n<p>There are many other forsaken instructions that are not fully used, so next time you fire up your assembler, take a look at the specs again; maybe you will find something better. Well, if you have more ideas of the like, you are welcome to send a comment.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Complex instructions are really useful, especially if you try to optimize the size of your code. Of course, modern processors are becoming more and more RISC&#8217;ish. But as for X86, its backward compatibility makes those instructions stay there (forever?), ready for you to use. The funny thing is that on modern X86 [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":""},"categories":[5],"tags":[],"jetpack_featured_media_url":"","jetpack_publicize_connections":[],"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pbWKd-s","_links":{"self":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/28"}],"collection":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=28"}],"version-history":[{"count":0,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/28\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_
route=%2Fwp%2Fv2%2Fmedia&parent=28"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=28"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=28"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}