Tuesday, January 22, 2013

Virtual classes

Gilad Bracha has a great post up on virtual classes:
I wanted to share a nice example of class hierarchy inheritance....All we need then, is a slight change to ThreadSubject so it knows how to filter out the synthetic frames from the list of frames. One might be able to engineer this in a more conventional setting by subclassing ThreadSubject and relying on dependency injection to weave the new subclass into the existing framework - assuming we had the foresight and stamina to use a DI framework in the first place.

I looked into virtual classes in the past, as part of my work at Google to support web app developers. Bruce Johnson put out the call to support problems like Gilad describes above, and a lot of us thought hard on it. Just replace "ThreadSubject" by some bit of browser arcana such as "WorkerThread". You want it to work one way on App Engine, and a different way on Internet Explorer, and you want to allow people to subclass your base class on each platform.

Nowadays I'd call the problem one of "product lines", having had the benefit of talking it over with Kurt Stirewalt. It turns out that software engineering and programming languages have something to do with each other. In the PL world, thinking about "product lines" leads you to coloring, my vote for one of the most overlooked ideas in PL design.

Here is my reply on Gilad's blog:

I'd strengthen your comment about type checking, Gilad: if you try to type check virtual classes, you end up wanting to make the virtual classes very restrictive, thus losing much of the benefit. Virtual classes and type checking are in considerable tension.

Also agreed about the overemphasis on type checking in PL research. Conceptual analysis matters, but it's hard to do, and it's even harder for a paper committee to review it.

I last looked into virtual classes as part of GWT and JS' (the efforts tended to go in tandem). Allow me to add to the motivation you provide. A real problem faced by Google engineers is to develop code bases that run on multiple platforms (web browsers, App engine, Windows machines) and share most of the code. The challenge is to figure out how to swap out the non-shared code on the appropriate platform. While you can use factories and interfaces in Java, it is conceptually cleaner if you can replace classes rather than subclass them. More prosaically, this comes up all the time in regression testing; how many times have we all written an interface and a factory just so that we could stub something out for unit testing?

I found type checking virtual classes to be problematic, despite having delved into a fair amount of prior work on the subject. From what I recall, you end up wanting to have *class override* as a distinct concept from *subclassing*, and for override to be much more restrictive. Unlike with subclassing, you can't refine the type signature of a method from the class being overridden. In fact, even *adding* a new method is tricky; you have to be very careful about method dispatch for it to work.

To see where the challenges come from, imagine class Node having both an override and a subclass. Let's call these classes Node, Node', and LocalizedNode, respectively. Think about what virtual classes mean: at run time, Node' should, in the right circumstances, completely replace class Node. That "replacement" includes replacing the base class of LocalizedNode!

That much is already unsettling. In OO type checking, you must verify that a subclass conforms to its superclass. How do you do this if you can't see the real superclass?

To complete the trap, imagine Node has a method "name" that returns a String. Node' overrides this and--against my rules--returns type AsciiString, because its names only have 7-bit characters in them. LocalizedNode, meanwhile, overrides the name method to look up names in a translation dictionary, so it's very much using Unicode strings. Now imagine calling "name" on a variable of static type Node'. Statically, you expect to get an AsciiString back. However, at run time, this variable might hold a LocalizedNode, in which case you'll get a String. Boom.

Given all this, if you want type checking, then virtual classes are in the research frontier. One reasonable response is to ditch type checking and write code the way you like. Another approach is to explore alternatives to virtual classes. One possible alternative is to look into "coloring", as in Colored FJ.

1 comment:

Gilad Bracha said...

cross posting my response here as well:

Hi Lex,

Great comments. Having spent too much of my life on types, I decided I would not allow them to constrain Newspeak's design or usage.

I have though a bit about how to typecheck it though. One possibility is, like Dart, to abandon soundness and treat typechecking as a helpful heuristic.

Another tack is to do concrete type inference on the ready to deploy application, where we can pin down all the classes involved. You'd know more about how to do that than me.

Your examples of coding problems are also spot on. Newspeak handles all these well - not only (o reven mainly) using class hierarchy inheritance. The fact that all modules are first class and parametric, and that the platform itself is a first class object, eliminate the need for DI frameworks and make sure you never get stuck.