{"id":4,"date":"2007-06-02T06:35:51","date_gmt":"2007-06-02T10:35:51","guid":{"rendered":"http:\/\/www.ragestorm.net\/blogs\/?p=4"},"modified":"2007-06-02T06:47:01","modified_gmt":"2007-06-02T10:47:01","slug":"recursive-data-structures","status":"publish","type":"post","link":"https:\/\/www.ragestorm.net\/blogs\/?p=4","title":{"rendered":"Recursive Data-Structures"},"content":{"rendered":"<p>For the sake of simplicity a linked list structure is a recursive one, it points to itself.<\/p>\n<p>struct Node {<br \/>\n\u00a0int data;<\/p>\n<p>&#8230;<\/p>\n<p>&#8230;<br \/>\n\u00a0Node* next;<br \/>\n};<\/p>\n<p>This case is easy to follow because no matter what, you always end up with accessing a Node structure.<\/p>\n<p>The problem arises when you have\u00a0recursive data-structures and terminals. I treat a terminal as a data structure which stops the recursion of the recursive data-structure. Unlike the above linked list structure, which stops when the &#8216;next&#8217; pointer is NULL, the &#8216;next&#8217; in the complex data-structure might point to either itself or a terminal.<\/p>\n<p>\u00a0struct TerminalNode {<\/p>\n<p>\u00a0int data;<\/p>\n<p>};<\/p>\n<p>\u00a0struct Node {<\/p>\n<p>\u00a0int data;<\/p>\n<p>\u00a0Node* next;<\/p>\n<p>};<\/p>\n<p>Now you might find yourselves asking why would one make the &#8216;next&#8217; point to a TerminalNode, but that&#8217;s a common data-structure layout. For example, an expression tree can represent arithmetic operations in a recursive way, but the values <em>are<\/em> the terminals &#8211; Add(5, Sub(3, 2)).<\/p>\n<p>In a structured layout, it looks something like this (where children is an array of pointers to &#8216;next&#8217; nodes):<\/p>\n<p>0: ExpressionNode: (type) Add, (children) 1, 2<\/p>\n<p>1: ExpressionNode: (type) Value 5, (children) NULL<\/p>\n<p>2: ExpressionNode: (type) Sub, (children) 3, 4<\/p>\n<p>3: ExpressionNode: (type) Value 3, (children) NULL<\/p>\n<p>4: ExpressionNode: (type) Value 2, (children) NULL<\/p>\n<p>So you can see that in this case an ExpressionNode of &#8216;Value&#8217; type is a TerminalNode whereas the ExpressionNode is a recursive one.<\/p>\n<p>This type of data-structure is not rare: To be honest, I have this situation in both <a target=\"_blank\" href=\"http:\/\/ragestorm.net\/distorm\/\" title=\"diStorm64\">diStorm64<\/a>\u00a0and diStorm3. In diStorm64 the TerminalNode is actually the Instruction-Info structure which describes the instruction, its mnemonic, its operand types, etc&#8230; and in diStorm3, it&#8217;s the above ExpressionNode thingy, but much more complex, of course&#8230;<\/p>\n<p>Everything is good and nice, but finally I can get to my point in this topic &#8211; How do you distinguish between a recursive node and a terminal node? That is, how do you know where the &#8220;tree&#8221; terminals?<\/p>\n<p>So there are many solutions to this one, I chose to cover two of them. The simplest one is to change your structures by adding another flag which states, &#8220;I&#8217;m a terminal&#8221; or &#8220;I&#8217;m a recursive node&#8221;. But then, it means you have to pay at least another extra byte for both structures. And then examining the flag only\u00a0<em>after <\/em>you accessed its structure. Not mentioning, it takes more memory space.<\/p>\n<p>The other solution, which seems to be more common, is to set the LSB (least-significant-bit) in the\u00a0&#8216;next&#8217; pointer, thus, if the bit is set, you know you reached a terminal node, even before you accessed it. And seeing the LSB is cleared, you know you continue normally with a recursive node. This is also good for caching, because the structures are compact.<\/p>\n<p>But what happens, if you have 4 types of terminal structures? Well, in a 32 bits environment all structures are (usually\/99%) aligned to a 4 bytes boundary, so it means you can easily use the two LSBs, which represent 4 different terminals&#8230;<\/p>\n<p>So accessing a single terminal structure is as simple as:<\/p>\n<p>if (pNode-&gt;next &amp; 1) {<\/p>\n<p>\u00a0pTerminal = (TerminalNode*)(pNode &amp; -1);<\/p>\n<p>\u00a0\/\/ Use pTerminal here&#8230;<\/p>\n<p>} else {<\/p>\n<p>\u00a0pNode &amp;= -1;<\/p>\n<p>\u00a0\/\/ Continue with another node&#8230;<\/p>\n<p>}<\/p>\n<p>You should have noticed by now\u00a0that a terminal structure doesn&#8217;t have a &#8216;next&#8217; pointer. So a NULL is out of question for terminals, but it&#8217;s still OK for recursive nodes, which means &#8220;dead-end&#8221;, rather than a terminal to mark something bad happened&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For the sake of simplicity a linked list structure is a recursive one, it points to itself. struct Node { \u00a0int data; &#8230; &#8230; \u00a0Node* next; }; This case is easy to follow because no matter what, you always end up with accessing a Node structure. The problem arises when you have\u00a0recursive data-structures and terminals. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":""},"categories":[3,16],"tags":[],"jetpack_featured_media_url":"","jetpack_publicize_connections":[],"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pbWKd-4","_links":{"self":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/4"}],"collection":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4"}],"version-history":[{"count":0,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/4\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}