Wednesday, September 30, 2009

DLL hell as a job description

I've had a hard time explaining my views on package distribution. Let me try and wrap it up this way: DLL hell is a job description. Dealing with it is work that has to be done, and there is enough of it to consume someone full time. DLL hell becomes truly problematic when that work falls on end users who aren't even software experts.

Let me break that down. First, I don't see how the root problem can be avoided. Large software is built modularly: it is broken into components that are maintained and advanced by independent teams. No matter how hard they try, and no matter what advanced component system they use, those teams will always introduce unexpected incompatibilities between different versions of different components. I am highly sympathetic to WCOP's goal of controlling external dependencies, though I see it as a goal that can only be partially fulfilled.

Second, there is a role to play in finding versions of components that are compatible with each other. My favorite approach is the one taken by Linux distributions, where within one distribution there is only one version of each component. Less centralized approaches, such as Eclipse plugins, are also plausible, but they have severe difficulties. In the Eclipse case, the model mostly appears to work well only when the added functionality is minor and the plugins don't interact with each other very much.

With all of that foundation, the DLL hell of Microsoft Windows is much easier to talk about: End users are playing the same role as those making a Linux distribution. Even computer specialists will resent this undesired work, and most users of Windows are not computer specialists. The only good way I see to fix this is to shift the distribution-making work onto some other group. That's challenging with Windows software being sold per copy, but perhaps it can be made to work with enough red tape. Alternatively, perhaps Windows could move over to a subscription model where the application bits can be freely copied. If the bits could be freely copied, then Windows distributions could sprout just as easily as Linux distributions have.

Wednesday, September 23, 2009

Public access to publicly funded research

Bravo to those supporting the Federal Research Public Access Act. If it becomes law, then publications resulting from publicly funded research would have to be made available to the general public.

The specific law makes sense as a part of accountability for public funds. If public funds are spent on research, then the public can rightfully demand to see the results of that research.

Additionally, it's simply good for the progress of knowledge. We progress faster when any random person can take part in the scholarly debate taking place in journals. Currently, anyone interested has to either pay the journals' dues or physically trek to a library that has paid them.

In addition to this act passing, it would be nice if the Association for Computing Machinery stopped hiding its sponsored conferences' publications behind a pay wall. The ACM is supposed to support the advancement of computing knowledge, not tax it.

Monday, September 21, 2009

Exclusively live code

Dead-for-now code splitting has the compiler divide up the program into smaller fragments of code that can be downloaded individually. How should the compiler take advantage of this ability? How should it define the chunks of code it divides up?

Mainly what GWT does is carve out code that is exclusively live once a particular split point in the program has been activated. Imagine a program with three split points A, B, and C. Some of the code in the program is needed initially, some is needed when only A has been activated, some is needed when only B has been activated, and some is needed once both are activated, etc. Formally, these are written as L({}), L({A}), L({B}), and L({A,B}). The "L" stands for "live", the opposite of "dead". The entire program is equivalent to L({A,B,C}), because GWT strips out any code that is not live under any assumptions.

The code exclusively live to A would be L({A,B,C})-L({B,C}). This is the code that is only needed once A has been activated: it is not live if only B has been activated, not live if only C has been activated, and not live when B and C are activated together. Because such code is not needed until A is activated, it's perfectly safe to delay loading it until A is reached. That's just how the GWT code splitter works: it finds the code exclusive to each split point and only loads that code once the split point is activated.

That's not the full story, though. Some code isn't exclusive to any fragment. Such code is all put into a leftovers fragment that must be loaded before any of the exclusive fragments. Dealing with the leftovers fragment is a significant issue for achieving good performance with a code-split program.
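The fragment computation described above can be sketched with ordinary set arithmetic. In this toy model (two split points A and B, hypothetical function names; the real liveness sets come from the compiler's analysis), L(S) maps each set of activated split points to the set of live functions:

```python
# Toy model of GWT-style exclusive-fragment computation.
# Function names are hypothetical; liveness sets would really come from
# the compiler's liveness analysis.

# L(S): code live when exactly the split points in S have been activated.
L = {
    frozenset():           {"init"},
    frozenset({"A"}):      {"init", "shared", "a_only"},
    frozenset({"B"}):      {"init", "shared", "b_only"},
    frozenset({"A", "B"}): {"init", "shared", "a_only", "b_only"},
}

# The entire program is L({A,B}): code dead under all assumptions is stripped.
everything = L[frozenset({"A", "B"})]

# Code exclusive to A: live somewhere, but never live unless A is activated.
exclusive_a = everything - L[frozenset({"B"})]
exclusive_b = everything - L[frozenset({"A"})]

# Leftovers: needed after the initial download but exclusive to no split point.
initial = L[frozenset()]
leftovers = everything - initial - exclusive_a - exclusive_b

print(exclusive_a)  # {'a_only'}
print(exclusive_b)  # {'b_only'}
print(leftovers)    # {'shared'}
```

Here "shared" is live once either split point is activated, so it cannot go in an exclusive fragment and lands in the leftovers, which is exactly why that fragment must be loaded before any exclusive one.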

That's the gist of the approach: exclusively live fragments as the core, plus a leftovers fragment to hold the odds and ends. It's natural to ask why, and indeed I'm far from positive it's ideal. It would be more intuitive, at least to me, to focus on positively live code such as L({A}) and L({B,C}). The challenge so far is coming up with such a scheme that generalizes to lots of split points. It's easy to dream up strategies that cause horrible compile time, bad web-browser caching behavior, or both, problems that the current scheme doesn't have. The splitting strategy in GWT is viable, but there might well be better ones.

Wednesday, September 9, 2009

Microsoft sticks to their guns on synchronous code loading

Microsoft's Doloto has been reannounced, and it sounds like they are planning to stick to synchronous code loading:
Profiling information is used to calculate code coverage and a clustering strategy. This determines which functions are stubbed out and which are not and groups functions into batches which are downloaded together, called clusters.
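The quoted strategy might be sketched roughly as follows. This is a toy model, not Doloto's actual algorithm: the profile data, cutoff, and window size are all invented for illustration. Functions observed early in profiling stay in the initial download; later ones are stubbed out and batched into clusters by when they were first called:

```python
# Toy sketch of profile-driven clustering: hypothetical profile data mapping
# each function to the seconds elapsed before its first call in profiling runs.
profile = {
    "render_home": 0.1,
    "validate_form": 2.5,
    "open_chat": 9.0,
    "chat_history": 9.2,
}

INITIAL_CUTOFF = 1.0   # called this early => ship in the initial download
CLUSTER_WINDOW = 5.0   # later functions batch together by first-use window

initial, clusters = [], {}
for name, first_use in sorted(profile.items(), key=lambda kv: kv[1]):
    if first_use <= INITIAL_CUTOFF:
        initial.append(name)            # never stubbed out
    else:
        batch = int(first_use // CLUSTER_WINDOW)   # stubbed; grouped by window
        clusters.setdefault(batch, []).append(name)

print(initial)   # ['render_home']
print(clusters)  # {0: ['validate_form'], 1: ['open_chat', 'chat_history']}
```

The point of the sketch is only the shape of the decision: coverage data picks which functions to stub, and some grouping heuristic decides which stubbed functions download together.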

I tentatively believe this approach can produce a somewhat reasonable user experience. It has some unfortunate problems, though, because it is unpredictable which functions get stubbed out. A call to any stubbed-out function, whichever ones the compiler chooses, will result in a blocking network operation to download more code. Whenever this happens:

  • Unrelated parts of the application are also paused, not just the parts that need the missing code.
  • There is no way for the programmer to sensibly deal with a download failure. The entire application has to halt.
  • There is no way to give a status update to the user indicating that a download is in progress.


These problems are fundamental. In practice, matters are even worse. On many browsers, a synchronous network download will lock up the entire browser, including tabs other than the one that issued the request. Locking an entire browser does not make for a good user experience. It does not make people feel like they are in good hands when they visit a web site.

GWT avoids these problems by insisting that the application never block on a code download. A request for a code download returns immediately. If a network download is required, its success or failure is signaled through an asynchronous callback. Until that callback fires, the rest of the application keeps running.
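In GWT's Java API this is the GWT.runAsync idiom. The control flow can be sketched with a simulated event loop (the loader, the "chat-code" fragment, and the event list are all illustrative, not GWT's API):

```python
# Sketch of non-blocking code loading with an asynchronous callback.
# A simulated event queue stands in for the browser's event loop.
pending = []

def run_async(load_fragment, on_success, on_failure):
    """Request a code fragment; return immediately.

    The result is delivered later via callback, so the caller never blocks.
    """
    def deliver():
        try:
            on_success(load_fragment())
        except IOError as err:
            on_failure(err)   # the app can show an error and keep running
    pending.append(deliver)   # fires on a later event-loop turn

events = []
run_async(lambda: "chat-code",
          on_success=lambda code: events.append("loaded:" + code),
          on_failure=lambda err: events.append("failed"))
events.append("app still responsive")   # runs before the download completes

for deliver in pending:   # simulated later event-loop turn: callbacks fire now
    deliver()

print(events)  # ['app still responsive', 'loaded:chat-code']
```

Note the ordering: the application continues doing work between requesting the fragment and receiving it, and a failure arrives through the same callback path instead of halting everything.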