Saturday, March 23, 2013

C compilers exploiting undefined behavior

It's getting out of hand the way C compilers exploit undefined behavior. I see via John Regehr's blog that there is a SPEC benchmark being turned into a noop via an undefined-behavior argument.

This isn't what the spec writers had in mind when they added undefined behavior. To fix it, Regehr's idea of having extra checkers to find such problems is a plausible one, though it will take a dedicated effort to get there.

An easier thing to do would be for gcc and Clang to stop the madness! If they see an undefined behavior bullet in their control-flow graphs, then they should leave it there, rather than assuming it won't happen and reasoning backward. This will cause some optimizations to stop working, but really, C compilers were already plenty good 10 years ago. The extra level of optimizations is not a net win for developers. Developers want speed, sure, but above all they want their programs to do what they look like they do.

It should also be possible to improve the spec around this, to pin down what undefined behavior means a little more specifically. For example, left-shifting into the sign bit of a signed integer is undefined behavior. That's way underspecified. The only real options are: shift into the sign bit as expected, turn the integer into unpredictable garbage, or throw an exception. As things stand, a C compiler is allowed to observe a bad left shift and then turn your whole program into a noop.