{"id":74,"date":"2008-12-29T13:43:20","date_gmt":"2008-12-29T15:43:20","guid":{"rendered":"http:\/\/www.ragestorm.net\/blogs\/?p=74"},"modified":"2008-12-29T13:44:24","modified_gmt":"2008-12-29T15:44:24","slug":"to-memset-or-not","status":"publish","type":"post","link":"https:\/\/www.ragestorm.net\/blogs\/?p=74","title":{"rendered":"String Initialization is Tricksy"},"content":{"rendered":"<p>A friend of mine had to hand in an assignment for a Computer Science course at university. As I understood it, it was a relatively easy assignment. The point is that this friend is a very experienced programmer and knows a thing or two about it. Anyway, he had a line in the code which goes like this:<br \/>\nchar buf[1024] = &#8220;abc&#8221;;<\/p>\n<p>You don&#8217;t even need to know C in order to understand that line, right? I assume we all agree on that. It simply initializes the buffer with a constant string literal. So his lecturer asked him what this line does <em>precisely<\/em>. And to his surprise, his answer was incorrect. The correct answer is that the <strong>whole<\/strong> buffer is initialized: the string constant is copied and the rest of the array is zeroed (this can be done in a few ways, for example by copying from a buffer that already has the zeros at the end of it). So today another friend called me on the phone to ask about this thing, and why our first friend was wrong about it. Now as a reverser, I suppose I need to know the answer to such a simple matter as well. But the sad part was that I was just as wrong as the two of them. I opened the C standard and started to search for the answer. I wanted solid proof for the matter at hand. Looking here and there, it took me around 15 minutes to lay my hands on the sentence that settled the whole matter. 
And I quote:<\/p>\n<p>&#8220;If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, <u>or fewer characters in a string literal used to initialize an array of known size<\/u> than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.&#8221;<\/p>\n<p>The underlined text is the answer: if there are fewer characters in the string literal than elements in the array, the <strong>remainder<\/strong> has to be initialized as well. There is another clause which explains how that initialization is done, but for now, let it be &#8216;zeroing&#8217;.<\/p>\n<p>Now the reason I was wrong about it is that I happened to see a lot of code like this (for example):<br \/>\nmov [buf], &#8216;a&#8217;<br \/>\nmov [buf+1], &#8216;b&#8217;<br \/>\nmov [buf+2], &#8216;c&#8217;<br \/>\nmov [buf+3], &#8216;\\0&#8217;<\/p>\n<p>in lots of functions, which suggests the source C code is:<\/p>\n<p>char buf[] = &#8220;abc&#8221;;<\/p>\n<p>For this case the standard says that the size of the buffer is taken from the size of the string literal, including the null terminator. So that&#8217;s why I never saw a memset coming in to initialize the whole buffer. Besides, maybe most people code it this way:<br \/>\nchar buf[1024];<br \/>\nstrcpy(buf, &#8220;abc&#8221;);<\/p>\n<p>Which doesn&#8217;t lead to a memset or any other initialization of the rest of the array.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A friend of mine had to hand in an assignment for a Computer Science course at university. As I understood it, it was a relatively easy assignment. The point is that this friend is a very experienced programmer and knows a thing or two about it. 
Anyway, he had a line in the code which goes like this: [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":""},"categories":[5,20],"tags":[],"jetpack_featured_media_url":"","jetpack_publicize_connections":[],"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pbWKd-1c","_links":{"self":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/74"}],"collection":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=74"}],"version-history":[{"count":0,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/74\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=74"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=74"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ragestorm.net\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=74"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}