Wednesday, November 24, 2010

Floating-point computation: if you don't care what you compute

There's an old saw among programmers that "I can make this program run arbitrarily fast, so long as you don't care what it produces". When you put it like that, it's obviously a bad idea: you always care about the output, even for a random number generator. In practice, this question comes up all the time, just not stated as starkly. Nowhere is this situation more common than with floating-point computations.

William Kahan has written a windy tour of error accumulation in floating-point arithmetic and posted it online. The content reads much like a blog, but he's publishing it as an ever-growing PDF file. Here are a few of the interesting take-aways:
  • There's no "mindless" way to remove floating point error from your program. It takes some work.
  • Using high-precision floating-point is remarkably effective in practice. Take whatever size of floating-point number you think you need, double the number of bits, and compute with that.
  • Running your program in different rounding modes is remarkably effective at detecting whether your program is unstable. If rounding at the 53rd bit of precision makes any difference at all in your program's output, then your implementation is so unstable that you are probably generating bad results.
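Both of those take-aways can be sketched in a few lines. The following is a toy illustration, not Kahan's own code; since CPython doesn't portably expose the hardware's rounding-mode controls, Python's `decimal` module stands in for both the higher-precision arithmetic and the rounding-mode experiment.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR, ROUND_CEILING

# Catastrophic cancellation in 53-bit doubles: 1e16 + 1 rounds back to
# 1e16, so the subtraction silently loses the 1.
naive = (1e16 + 1.0) - 1e16          # 0.0 in double precision

# Kahan's advice: redo the computation at roughly double the precision.
getcontext().prec = 34               # ~112 bits, comparable to IEEE binary128
careful = (Decimal(10) ** 16 + 1) - Decimal(10) ** 16

print(naive)    # 0.0
print(careful)  # 1

# The rounding-mode test: rerun an inexact operation under different
# rounding directions and compare the results. Any difference at the last
# digit means the operation is inexact and worth scrutinizing.
getcontext().prec = 8
getcontext().rounding = ROUND_FLOOR
down = Decimal(1) / Decimal(3)
getcontext().rounding = ROUND_CEILING
up = Decimal(1) / Decimal(3)
print(down, up)  # 0.33333333 vs 0.33333334
```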

Thinking about these problems has given me a new-found love of interval arithmetic. As Kahan discusses in detail, interval arithmetic gives you a proven bound on how large your error can be. As he also discusses, though, if you mindlessly implement a mathematical formula exactly the way it is written, your intervals tend to be outrageously large. It is common, for example, for the proven error bars to double after every operation.
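To make the blow-up concrete, here is a minimal toy interval type (my own sketch, and a simplified one: a rigorous implementation would also round lower bounds down and upper bounds up). Evaluating x² − 2x + 1 as literally written inflates the bounds, because each occurrence of x is treated as an independent unknown, while the algebraically identical (x − 1)² keeps them tight:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        p = (self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi)
        return Interval(min(p), max(p))

    @property
    def width(self):
        return self.hi - self.lo

x = Interval(0.9, 1.1)                       # x = 1 +/- 0.1

# Mindless transcription of x^2 - 2x + 1: the bounds balloon.
naive = x * x - Interval(2.0, 2.0) * x + Interval(1.0, 1.0)

# The equivalent (x - 1)^2 confines the dependency to one subtraction.
d = x - Interval(1.0, 1.0)
rewritten = d * d

print(naive.width)      # ~0.8
print(rewritten.width)  # ~0.02, forty times tighter
```

Same formula, same inputs, a forty-fold difference in the proven error bars; that rewriting step is exactly the kind of non-mindless work Kahan is talking about.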

My question is: why don't mainstream languages support interval arithmetic? If they did, programmers could include asserts in their code about the size of the error bars at each step of a computation. Initially, these assertions would all fail, because the intervals would be far too large. With some cleverness and patience, though, programmers could get the intervals much smaller. Over time, they would build up a toolbox of techniques that apply in most situations. It would fit right in with how software engineering is practiced nowadays.
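Here is a hypothetical sketch of what that workflow might look like, with values tracked as bare (lo, hi) pairs; the helper names are made up for illustration:

```python
# Track each value as a (lo, hi) pair and assert a budget on its width
# after every step, the same way one asserts invariants today.

def width(iv):
    lo, hi = iv
    return hi - lo

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def sub(a, b):
    return (a[0] - b[1], a[1] - b[0])

x = (0.9, 1.1)                       # measured input: 1 +/- 0.1

step1 = sub(x, (0.25, 0.25))         # subtracting an exact constant
assert width(step1) <= 0.21, "error bars blew up at step 1"

step2 = add(step1, step1)            # widths add: now ~0.4
assert width(step2) <= 0.41, "error bars blew up at step 2"
# Tightening that budget to, say, 0.05 would make the assert fail and
# push the programmer to restructure the computation until it passes.
```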

Programming this way sounds like it would take longer than the way I currently implement numerics. What's the alternative, though? Graphics programs aside, don't we want output that is correct, not just output that looks good? Imagine a tech-savvy client asking how accurate our program's output is. We'd have to say, "I don't really know".

1 comment:

Stephen Lewis said...

Although it doesn't count as a mainstream language, I love that Frink supports interval arithmetic. (And it's a pretty awesome tool for those 'back of the envelope' calculations.)