Showing posts with label abstraction. Show all posts
Showing posts with label abstraction. Show all posts

Tuesday, January 22, 2013

Virtual classes

Gilad Bracha has a great post up on virtual classes:
I wanted to share a nice example of class hierarchy inheritance....All we need then, is a slight change to ThreadSubject so it knows how to filter out the synthetic frames from the list of frames. One might be able to engineer this in a more conventional setting by subclassing ThreadSubject and relying on dependency injection to weave the new subclass into the existing framework - assuming we had the foresight and stamina to use a DI framework in the first place.

I looked into virtual classes in the past, as part of my work at Google to support web app developers. Bruce Johnson put out the call to support problems like Gilad describes above, and a lot of us thought hard on it. Just replace "ThreadSubject" by some bit of browser arcana such as "WorkerThread". You want it to work one way on App Engine, and a different way on Internet Explorer, and you want to allow people to subclass your base class on each platform.

Nowadays I'd call the problem one of "product lines", having had the benefit of talking it over with Kurt Stirewalt. It turns out that software engineering and programming languages have something to do with each other. In the PL world, thinking about "product lines" leads you to coloring, my vote for one of the most overlooked ideas in PL design.

Here is my reply on Gilad's blog:

I'd strengthen your comment about type checking, Gilad: if you try to type check virtual classes, you end up wanting to make the virtual classes very restrictive, thus losing much of the benefit. Virtual classes and type checking are in considerable tension.

Also agreed about the overemphasis on type checking in PL research. Conceptual analysis matters, but it's hard to do, and it's even harder for a paper committee to review it.

I last looked into virtual classes as part of GWT and JS' (the efforts tended to go in tandem). Allow me to add to the motivation you provide. A real problem faced by Google engineers is to develop code bases that run on multiple platforms (web browsers, App engine, Windows machines) and share most of the code. The challenge is to figure out how to swap out the non-shared code on the appropriate platform. While you can use factories and interfaces in Java, it is conceptually cleaner if you can replace classes rather than subclass them. More prosaically, this comes up all the time in regression testing; how many times have we all written an interface and a factory just so that we could stub something out for unit testing?

I found type checking virtual classes to be problematic, despite having delved into a fair amount of prior work on the subject. From what I recall, you end up wanting to have *class override* as a distinct concept from *subclassing*, and for override to be much more restrictive. Unlike with subclassing, you can't refine the type signature of a method from the class being overridden. In fact, even *adding* a new method is tricky; you have to be very careful about method dispatch for it to work.

To see where the challenges come from, imagine class Node having both an override and a subclass. Let's call these classes Node, Node', and LocalizedNode, respectively. Think about what virtual classes mean: at run time, Node' should, in the right circumstances, completely replace class Node. That "replacement" includes replacing the base class of LocalizedNode!

That much is already unsettling. In OO type checking, you must verify that a subclass conforms to its superclass. How do you do this if you can't see the real superclass?

To complete the trap, imagine Node has a method "name" that returns a String. Node' overrides this and--against my rules--returns type AsciiString, because its names only have 7-bit characters in them. LocalizedNode, meanwhile, overrides the name method to look up names in a translation dictionary, so it's very much using Unicode strings. Now imagine calling "name" on a variable of static type Node'. Statically, you expect to get an AsciiString back. However, at run time, this variable might hold a LocalizedNode, in which case you'll get a String. Boom.

Given all this, if you want type checking, then virtual classes are in the research frontier. One reasonable response is to ditch type checking and write code the way you like. Another approach is to explore alternatives to virtual classes. One possible alternative is to look into "coloring", as in Colored FJ.

Wednesday, March 28, 2012

Shapiro on compiling away abstraction

Via Lambda the Ultimate, I see that Jonathan Shapiro has a rambling retrospective on BitC and why he thinks it has gotten into a dead end.

One of the several themes is that the following combination of design constraints cause trouble:
  • He wants good performance, comparable to C++.
  • He wants a better set of abstraction facilities than C++.
  • He wants separate compilation to do most of the work, like in C++, rather than have the runtime do most of the real compilation, as in Java.
It's hard to excerpt, but here's him explaining the way this all works in C++:
In C++, the "+" operator can be overloaded. But (1) the bindings for primitive types cannot be replaced, (2) we know, statically, what the bindings and representations *are* for the other types, and (3) we can control, by means of inlining, which of those operations entail a procedure call at run time. I'm not trying to suggest that we want to be forced to control that manually. The key point is that the compiler has enough visibility into the implementation of the operation that it is possible to inline the primitive operators (and many others) at static compile time.
To contrast, BitC has trouble due to its extra level of abstraction:
In BitC, *both* of these things *are* abstracted at static compile time. It isn't until link time that all of the representations are in hand.

He goes on to consider the implications of different points in the design space. One point he brings up is that there is another stage of compilation that can be helpful to exploit: install time. Instead of compile time, run time, or even the link time for an application, you can get a lot of leverage if you apply compilation techniques at the point that a collection of applications and libraries are installed onto a system.

Web toolkits are a different domain than Shapiro is thinking about, but they face this particular question as well. You can greatly improve web applications if the tools do some work before all the source code gets to the web browser in front of the user. Without tools, if you just hack JavaScript files by hand and post them on a static HTTP server, the web browser ends up lazily linking the program, which means the application takes longer to start up. Good toolkits do a lot of work before the code makes it down to the end user, and in particular they really go to down at link time. At link time, the entire program is available, so it's possible to divide the program content--both programmatic code and media resources--into reasonably sized bundles of downloadable content.