Monday, August 6, 2012

Deprecation as product lines

I would like to draw a connection between two lines of research: deprecation and product lines. The punchline is that my personal view on deprecation can be explained by reference to product lines: deprecation is a product line with just two products. To see how that connection works, first take a look at what each of these terms means.

A product line is a collection of products built from a single shared pool of source code. Some examples of a product line would be:

  • The Android, iPhone, Windows, and Macintosh versions of an application.
  • The English, Chinese, and Lojban versions of an application.
  • The trial, normal, and professional versions of an application.
  • The embedded-Java and full-Java versions of a Java library.

There is a rich literature on product lines; an example I am familiar with is the work on CFJ (Colored Featherweight Java). CFJ is Java extended with "color" annotations. You "color" your classes, methods, and fields according to which products in the product line each part of the program belongs to. A static checker verifies that the colors are consistent with each other, e.g. that the mobile version of your code does not invoke a method that is only present in the desktop version. A build-time tool can then build an individual product in the product line by extracting just the code that goes with a designated color. To my knowledge, CFJ has not been explicitly used outside of the CIDE tool it was developed with, and CIDE itself does not appear to be widely used. Conversely, the tools that are widely used for product lines don't have a good theoretical grounding.
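As a rough illustration of the checking and extraction steps, here is a toy sketch in Java. It is not CFJ's actual type system or CIDE's implementation. The classes ColoredMethod and ColorChecker, the rule "a callee's colors must be a subset of its caller's colors", and the example method names are all assumptions made up for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model, not CFJ itself: a method is colored with the features it requires.
final class ColoredMethod {
    final String name;
    final Set<String> colors; // the method exists only in products that select all of these
    final List<String> calls; // names of the methods its body invokes

    ColoredMethod(String name, Set<String> colors, List<String> calls) {
        this.name = name;
        this.colors = colors;
        this.calls = calls;
    }
}

final class ColorChecker {
    // Assumed rule: a call is consistent when the callee is present in every
    // product that contains the caller, i.e. callee colors are a subset of caller colors.
    static List<String> check(Map<String, ColoredMethod> program) {
        List<String> errors = new ArrayList<>();
        for (ColoredMethod caller : program.values()) {
            for (String calleeName : caller.calls) {
                ColoredMethod callee = program.get(calleeName);
                if (callee == null || !caller.colors.containsAll(callee.colors)) {
                    errors.add(caller.name + " -> " + calleeName);
                }
            }
        }
        return errors;
    }

    // Build one product: keep the methods whose colors are all among the selected ones.
    static Set<String> extract(Map<String, ColoredMethod> program, Set<String> selected) {
        Set<String> kept = new TreeSet<>();
        for (ColoredMethod m : program.values()) {
            if (selected.containsAll(m.colors)) {
                kept.add(m.name);
            }
        }
        return kept;
    }

    // Hypothetical program: uncolored "render" calls desktop-only "openWindow".
    static Map<String, ColoredMethod> example() {
        Map<String, ColoredMethod> p = new LinkedHashMap<>();
        p.put("render", new ColoredMethod("render", Set.of(), List.of("openWindow")));
        p.put("openWindow", new ColoredMethod("openWindow", Set.of("desktop"), List.of()));
        p.put("vibrate", new ColoredMethod("vibrate", Set.of("mobile"), List.of("render")));
        return p;
    }
}
```

In this example the checker flags the call from render to openWindow, which is exactly the call that would dangle in the product extracted for the "mobile" color.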

Deprecation, meanwhile, is the annotation of code that is going away. As with the widely used product-line tools, deprecation tools are very widely used but not well grounded theoretically. With deprecation, programmers mark chunks of code as deprecated, and a compile-time checker emits warnings whenever non-deprecated code accesses deprecated code. I have previously shown that the deprecation checker in Oracle javac has holes; there are cases where removing the deprecated code results in a program that either does not type check or does not behave the same.
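For readers who have not used it, the basic mechanism looks like this in Java. The classes Greeter, Client, and DeprecationDemo are hypothetical names invented for this sketch; only the @Deprecated annotation and the reflection calls are standard Java. The example shows the ordinary warning case, not one of the holes mentioned above.

```java
import java.lang.reflect.Method;

// Hypothetical library class with one method that is going away.
class Greeter {
    /** @deprecated use {@link #greet(String)} instead. */
    @Deprecated
    String greetWorld() {
        return greet("world");
    }

    String greet(String name) {
        return "Hello, " + name;
    }
}

// Non-deprecated client code in a different top-level class.
class Client {
    String run() {
        // javac reports a deprecation warning for this call
        // (compile with -Xlint:deprecation to see the exact location).
        return new Greeter().greetWorld();
    }
}

class DeprecationDemo {
    // @Deprecated is retained at run time, so tools can also query it reflectively.
    static boolean isDeprecated(Class<?> type, String methodName, Class<?>... params) {
        try {
            Method m = type.getDeclaredMethod(methodName, params);
            return m.isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

The program still compiles and runs; the warning is only advice that Client will break once greetWorld is deleted.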

As much as I enjoyed working on a specific theoretical framework for deprecation, I must now admit that it's really a special case of CFJ. For the simpler version of deprecation checking, choose two colors, "non-deprecated" and "everything". Mark all of the code with the "everything" color, and mark only the non-deprecated code with the "non-deprecated" color. You then have two products in the product line: one where you leave everything as is, and one where you keep only the non-deprecated code.
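The two-product view can be sketched as a small Java model. This is a toy under stated assumptions, not the framework from my earlier work: the Member and TwoProducts classes and the example member names are hypothetical, and "uses" stands in for any kind of access (call, field read, subclassing).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model: a program is a set of named members that reference one another.
final class Member {
    final String name;
    final boolean deprecated;
    final List<String> uses;

    Member(String name, boolean deprecated, List<String> uses) {
        this.name = name;
        this.deprecated = deprecated;
        this.uses = uses;
    }
}

final class TwoProducts {
    // Product 1: the program as written.
    static Set<String> everything(Map<String, Member> program) {
        return new TreeSet<>(program.keySet());
    }

    // Product 2: only the non-deprecated members.
    static Set<String> nonDeprecated(Map<String, Member> program) {
        Set<String> kept = new TreeSet<>();
        for (Member m : program.values()) {
            if (!m.deprecated) {
                kept.add(m.name);
            }
        }
        return kept;
    }

    // Simple deprecation check: non-deprecated code must not use deprecated code.
    static List<String> warnings(Map<String, Member> program) {
        List<String> out = new ArrayList<>();
        for (Member m : program.values()) {
            if (m.deprecated) {
                continue; // deprecated code may use anything
            }
            for (String used : m.uses) {
                Member target = program.get(used);
                if (target != null && target.deprecated) {
                    out.add(m.name + " uses deprecated " + used);
                }
            }
        }
        return out;
    }

    // True when every reference inside the product points at a member of the product.
    static boolean closed(Map<String, Member> program, Set<String> product) {
        for (String name : product) {
            if (!product.containsAll(program.get(name).uses)) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical program with two deprecated members and no warnings.
    static Map<String, Member> example() {
        Map<String, Member> p = new LinkedHashMap<>();
        p.put("parse", new Member("parse", false, List.of("tokenize")));
        p.put("tokenize", new Member("tokenize", false, List.of()));
        p.put("parseLegacy", new Member("parseLegacy", true, List.of("parse", "oldTokenize")));
        p.put("oldTokenize", new Member("oldTokenize", true, List.of()));
        return p;
    }
}
```

When the check reports no warnings, the non-deprecated product has no dangling references in this model. That only covers names resolving; whether the smaller product also behaves the same is the harder question.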

There is a lot of potential future work in this area; for this post I just wanted to draw the connection. I believe CFJ would benefit from explicitly claiming that the colored subsets of the program have the same behavior as the full program; I believe it has this property, and I went to the trouble of proving it holds for deprecation checking. Also, I believe there is fruitful work in studying the kinds of colors that are available. With deprecation, there is usually no single point in time at which you can remove all deprecated code in the entire code base. Instead, you want several colors for the deprecated code, for example different colors for different future versions of the software.
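One way to picture version-specific colors is the following toy sketch in Java. Everything here is an assumption made for illustration: the VersionedMember and VersionColors classes, the removedIn field, and the rule that a member may only use members that live at least as long as it does. Nothing like removedIn exists on Java's built-in @Deprecated annotation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model: each member carries the first version in which it no longer exists.
final class VersionedMember {
    static final int NEVER = Integer.MAX_VALUE; // not deprecated at all

    final String name;
    final int removedIn;
    final List<String> uses;

    VersionedMember(String name, int removedIn, List<String> uses) {
        this.name = name;
        this.removedIn = removedIn;
        this.uses = uses;
    }
}

final class VersionColors {
    // The product for a version keeps every member not yet removed in that version.
    static Set<String> product(Map<String, VersionedMember> program, int version) {
        Set<String> kept = new TreeSet<>();
        for (VersionedMember m : program.values()) {
            if (m.removedIn > version) {
                kept.add(m.name);
            }
        }
        return kept;
    }

    // Assumed rule: a member may only use members that are removed no earlier than itself.
    static List<String> violations(Map<String, VersionedMember> program) {
        List<String> out = new ArrayList<>();
        for (VersionedMember m : program.values()) {
            for (String used : m.uses) {
                VersionedMember target = program.get(used);
                if (target != null && target.removedIn < m.removedIn) {
                    out.add(m.name + " -> " + used);
                }
            }
        }
        return out;
    }

    // Hypothetical program: saveAs lives until version 3 but uses writeV1, gone in version 2.
    static Map<String, VersionedMember> example() {
        Map<String, VersionedMember> p = new LinkedHashMap<>();
        p.put("save", new VersionedMember("save", VersionedMember.NEVER, List.of("writeV2")));
        p.put("writeV2", new VersionedMember("writeV2", VersionedMember.NEVER, List.of()));
        p.put("saveAs", new VersionedMember("saveAs", 3, List.of("writeV1")));
        p.put("writeV1", new VersionedMember("writeV1", 2, List.of()));
        return p;
    }
}
```

The two-color case is the special case where removedIn takes only two values. With more values, each future version is its own product, and the check flags saveAs because the version 2 product would still contain it after writeV1 is gone.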