Thursday, April 21, 2011

Practical challenges for attribute grammars

I've had occasion recently to work on a compiler implemented with an attribute grammar framework. Such frameworks let you declare attributes on AST nodes that are computed in terms of other attributes. In general, it's a powerful and convenient way to compute information about an AST, and I expect they make a lot of sense in tools that only analyze ASTs and don't generate or rewrite them. They save you from working out a specific computation order, because the tool can look at which attributes refer to which other ones and compute them all on demand.
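To make the idea concrete, here is a toy demand-driven evaluator in Python. It is a minimal sketch, not any real framework's API: the `attr` method, the cache, and the `depth` attribute are all invented for illustration.

```python
# A minimal sketch of demand-driven attribute evaluation. Real
# frameworks are far richer; the names here are made up.

class Node:
    def __init__(self, *children):
        self.children = list(children)
        self._cache = {}

    def attr(self, name):
        # Compute an attribute on demand and cache the result; the
        # evaluation order falls out of the dependencies automatically.
        if name not in self._cache:
            self._cache[name] = getattr(self, name)()
        return self._cache[name]

class Num(Node):
    def __init__(self, value):
        super().__init__()
        self.value = value

    def depth(self):
        return 1

class Add(Node):
    def depth(self):
        # Depends on the children's 'depth' attributes; we never state
        # a traversal order, we just ask for what we need.
        return 1 + max(c.attr("depth") for c in self.children)

tree = Add(Num(1), Add(Num(2), Num(3)))
print(tree.attr("depth"))  # 3
```

Note that nothing in the `Add.depth` definition says when the children's attributes get computed; the dependencies drive the order, which is the main appeal.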

For compilers, however, I no longer think attribute grammars work well. Let me describe three fundamental problems with attribute grammars for use in compilers.

  1. Attributes can be inherited, meaning they look up into the enclosing AST for information. Examples are the expected type of an expression, the enclosing module for any node, and the set of free variables available in some context. These attributes cannot be computed unless the node in question really is attached to an AST. However, compilers create new AST nodes all the time, and while those new nodes are being set up, they can't possibly be attached yet. Thus, unattached nodes are inconsistent with inherited attributes.
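Here is a hypothetical sketch of the failure mode: an inherited attribute (`enclosing_module`, a name invented for this example) that walks parent pointers and therefore has nothing to say about a freshly built, detached node.

```python
# Sketch of why inherited attributes break on detached nodes.
# The classes and attribute names are illustrative only.

class Node:
    def __init__(self):
        self.parent = None

    def enclosing_module(self):
        # Inherited attribute: walk up the tree until we hit a Module.
        node = self
        while node is not None:
            if isinstance(node, Module):
                return node
            node = node.parent
        raise ValueError("node is not attached to an AST")

class Module(Node):
    def __init__(self, name):
        super().__init__()
        self.name = name

class Call(Node):
    pass

mod = Module("main")
attached = Call()
attached.parent = mod
print(attached.enclosing_module().name)  # main

fresh = Call()  # a node mid-construction, not yet attached anywhere
try:
    fresh.enclosing_module()
except ValueError as e:
    print(e)  # node is not attached to an AST
```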

  2. Compilers change the AST. They desugar syntax into lower-level syntax, they optimize, they reorder parts of the program, and so on. Every time the AST changes, some cached attribute values become invalid. At that point, how do you flush the caches that are now stale? To some extent you can get by with flushing all caches after every compiler phase, but then later parts of a phase see stale data compared to earlier parts. This leads to bugs that can only be fixed by reasoning about the exact order in which the traversal happens and how the stale data relates to the current version of the AST.
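The stale-cache hazard fits in a few lines. This is a made-up node class with a single cached attribute, not any framework's real code:

```python
# Sketch of a stale attribute cache after an in-place AST rewrite.

class Num:
    def __init__(self, value):
        self.value = value
        self._cache = {}

    def const_value(self):
        # Cached synthesized attribute.
        if "const" not in self._cache:
            self._cache["const"] = self.value
        return self._cache["const"]

n = Num(2)
print(n.const_value())  # 2

# A later phase rewrites the node in place...
n.value = 5
# ...but nothing flushed the cache, so readers still see the old value.
print(n.const_value())  # still 2, not 5
```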

  3. Attributes have unpredictable performance characteristics. When you see a reference to an attribute, it might be cached and take constant time. On the other hand, it might trigger a cascade of attribute computations that stretches across the whole program. It's much harder to convince yourself that a particular AST traversal will run in linear time, because you don't know how expensive the attribute computations are.
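The cost cliff can be sketched with a hypothetical `index` attribute whose value depends on its predecessor's: the same lookup is either one cache hit or a chain of computations across the entire program.

```python
# Sketch of the cost cliff: an uncached attribute lookup can trigger
# a chain of computations. The Stmt class is invented for illustration.

class Stmt:
    def __init__(self, prev=None):
        self.prev = prev
        self._cache = {}

    def index(self):
        # Depends on the predecessor's 'index', so an uncached lookup
        # on the last statement walks the entire chain.
        if "index" not in self._cache:
            if self.prev is None:
                self._cache["index"] = 0
            else:
                self._cache["index"] = self.prev.index() + 1
        return self._cache["index"]

# Build a chain of 500 statements.
stmts = [Stmt()]
for _ in range(499):
    stmts.append(Stmt(prev=stmts[-1]))

last = stmts[-1]
print(last.index())  # first call: a 500-deep chain of computations
print(last.index())  # second call: a single cache hit
```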

These are all potentially addressable, but they are serious challenges that any practical tool will need to face. As things stand, I haven't seen an attribute grammar framework that is there yet. They don't really achieve their main benefit, which is to save you from reasoning about the evaluation order of, and dependencies between, different attributes. You end up reasoning about them anyway to get the bugs out and the performance up.

Instead of storing attributes on the AST nodes and computing them with an attribute grammar framework, a more traditional approach is as follows. Store a few, carefully selected bits of information on nodes, such as the types of expressions. These must be carefully maintained across all AST rewrites, and the compiler will be buggy if you get it wrong, so don't store too much this way. For most information, have each compiler phase compute and maintain whatever information it needs. This results in some amount of recomputation, but you can usually find a way to calculate whatever you need in linear time. Since any one phase will usually require at least linear time anyway, this is sufficient for getting the compiler reasonably performant before you do your first profiling run.
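As an illustration of the per-phase style, here is a sketch of a single linear pass that computes free variables by threading the bound-variable set down explicitly, rather than relying on an inherited attribute. The tuple-based AST encoding is invented for the example.

```python
# Sketch of per-phase computation: one linear traversal computes the
# free variables, passing context down as an argument.

def free_vars(node, bound=frozenset()):
    kind = node[0]
    if kind == "var":
        _, name = node
        return set() if name in bound else {name}
    if kind == "lam":
        _, param, body = node
        return free_vars(body, bound | {param})
    if kind == "app":
        _, fn, arg = node
        return free_vars(fn, bound) | free_vars(arg, bound)
    raise ValueError(f"unknown node kind: {kind}")

# (\x. x y) z  ->  the free variables are y and z
expr = ("app",
        ("lam", "x", ("app", ("var", "x"), ("var", "y"))),
        ("var", "z"))
print(sorted(free_vars(expr)))  # ['y', 'z']
```

Each call visits each node once, so the whole pass is linear in the size of the tree, which is exactly the property that is hard to guarantee when attribute lookups can fan out unpredictably.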
