A question for every C programmer out there: what does the following C program (with a few post-increment (++) and bitwise OR (|) operators thrown in to confuse you) output?

[Image: What will this C program output?]

I thought iIndex would be 5 and iValue would be 0x01020304. Compiled with Visual Studio 2005, the program printed iIndex as 5 but iValue as 0x01010101. Confounded, I compiled it with Visual Studio 2008 to check whether there was any difference. Nope, nothing changed. So what is going on here?

In terms of operator precedence, the post-increment operator (++) binds more tightly than the left-shift operator (<<), so why would the increments happen after the bitwise OR, even though the ORing is outside the parentheses?

I searched the ISO C standard document but could not find anything that would explain the behavior. (In a 550-page document, one can never be sure of finding the legalese that applies to a specific case.)

I thought I could at least see what the Microsoft compiler was doing if I changed the post-increments to pre-increments (++iIndex instead of iIndex++). Sure enough, iValue changed to 0x05050505. After playing around with various combinations of ++ and --, it was clear that the pre-increment/decrement operators were evaluated first to update iIndex. Then the bitwise shifts and ORs were performed on that value of iIndex to produce iValue. Finally, iIndex was updated by the post-increment/decrement operators.

Try working out the output of the following:

[Image: What will this C program output?]

Since I do not have a Linux installation handy to try things out, I have no idea how gcc evaluates these. Until I find the relevant part of the C standard (or lack thereof), this will be a rather disturbing hole in my understanding of C expressions...

Update [May 26 2008]: I found the following two possibly related paragraphs in the C spec.

a) From section 6.5 (Expressions), paragraph 2

 Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored. (71)...

71) This paragraph renders undefined statement expressions such as

i = ++i + 1;

a[i++] = i;

while allowing

i = i + 1;

a[i] = i;

b) From section 6.5.14 (Logical OR operator), paragraph 3

Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation; there is a sequence point after the evaluation of the first operand. If...

The first paragraph clearly renders the expression in question undefined, because iIndex is modified more than once between two sequence points. It may also explain why at most one increment happened during evaluation of the above expression, with the rest of the increments happening afterwards. I say "may" because the C specification does not define sequence points clearly enough for me to be absolutely sure.

The second paragraph makes the expression even more problematic to evaluate. If left-to-right evaluation is not guaranteed, which is something I was not aware of, and each of the sub-expressions (between the |s) has a side effect on the others (because iIndex is changed in each), there is no way to determine a priori what the expression will evaluate to (because the order of evaluation is not known).

In short, the above expression is undefined.


4 Responses to Look, Ma, I do not know C

  1. andrewl says:

    Nice puzzle! I’ve had the model in my head “prefix operator takes effect immediately, postfix operator takes effect after the semicolon” and it hasn’t failed me yet.

    BTW what is the pound in the “%#x”? I’ve always thought it was required to manually type out the number of hex digits desired, like “%08X” for a 32-bit type.

  2. Satya Das says:

    # adds a 0x before outputting the hex digits.

  3. greg says:

    This is expected, actually. If you look a bit closer at the ISO C standard (google suggests you can find a copy here: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf), it does explain why you’re seeing what you’re seeing.

    The operator precedence that you’re referring to has to do with grouping in the abstract syntax tree. For instance, *p++ will be evaluated as *(p++). However, if you look at section 6.5.2.4 (“Postfix increment and decrement operators”), it states “The side effect of updating the stored value of the operand shall occur between the previous and the next sequence point.”

    And if you wander over to Appendix C (“Sequence points”), you’ll see that in the code you wrote above, iIndex will not be updated until after the entire initialization expression has been executed.

    Compare that with the following:

    i = 1;
    if ((i++ > 1) || (i++ > 1)) { printf("i was greater than one.\n"); }

    And if you want to have some fun, see if you can guess what will happen when you mark iIndex as volatile… 😉

  4. Satya Das says:

    @greg
    Thanks for stopping by. The values output by the program do not seem to have any effect if the integer iIndex is declared volatile in the expressions mentioned in the post.

    I suspect you saw |(=bit wise OR) in the expression as || (logical OR) ?
