Sunday, July 8, 2012

Evan Farrer Converts Code from Python to Haskell

Evan Farrer has an interesting post up where he converts some code from Python to Haskell. Kudos to Farrer for empirically studying a language design question. Here is his summary:
The results of this experiment indicate that unit testing is not an adequate replacement for static typing for defect detection. While unit testing does catch many errors it is difficult to construct unit tests that will detect the kinds of defects that would be programatically detected by static typing. The application of static type checking to many programs written in dynamically typed programming languages would catch many defects that were not detected with unit testing, and would not require significant redesign of the programs.

I feel better about delivering code in a statically typed language if the code is more than a few thousand lines long. However, my feeling here is not due to the additional error checking you get in a statically typed language. Contra Farrer's analysis, I feel that this additional benefit is so small as to not be a major factor. For me, the advantages are in better code navigation and in locking developers down to using relatively boring solutions. Both of these lead to code that will stay more robust as it undergoes maintenance.

As such, the most interesting piece of evidence Farrer raises is that the four bodies of code he converted were straightforward to rewrite in Haskell. We can conclude, for these four small programs, that the dynamic features of Python were not important for expressiveness.

On the down side, Farrer's main conclusion is as much undermined by his evidence as supported. His main claim is that Haskell's type checker provides substantial additional error checking compared to what you get in Python. My objection is that all programs have bugs, and doing any sort of study of code is going to turn up some of them. The question is in the significance of those bugs. On this criterion the bugs Farrer finds do not look very important.

The disconnect is that practicing programmers don't count bugs by number. The attribute they care about is the overall bugginess of the software. Overall bugginess can be quantified in different ways; one way to do it is to consider the amount of time lost by end users due to bugs in the software. Based on this metric, a bug that loses a day's work for the end user is supremely important, more important than any feature. On the other hand, a bug that merely causes a visual artifact, and not very often, would be highly unimportant.

The bugs Farrer reports mostly have to do with misuse of the software. The API is called in an inappropriate way, or a bad input file is provided. In other words, the "bugs" have to do with the software misbehaving if its preconditions are not met, and the "fix" is to update the software to throw an explicit error message rather than to progress some distance before yielding a walkback (stack trace) on a dynamic type error.
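To make that distinction concrete, here is a minimal Python sketch. It is not taken from any of the programs Farrer converted; the function names and the table-of-dicts input are hypothetical. The first version checks its precondition and fails with an explicit message, while the second runs some distance before failing on a dynamic type error.

```python
def column_mean(rows, column):
    """Average one column of a table given as a list of dicts (hypothetical helper)."""
    # Explicit precondition check: reject bad input up front with a clear message.
    if not rows:
        raise ValueError("column_mean: rows must not be empty")
    for i, row in enumerate(rows):
        if not isinstance(row.get(column), (int, float)):
            raise ValueError(f"column_mean: row {i} has no numeric {column!r} value")
    return sum(row[column] for row in rows) / len(rows)


def column_mean_unchecked(rows, column):
    # Same computation without the check: bad input surfaces later as a
    # TypeError raised from inside sum(), far from the actual mistake.
    return sum(row[column] for row in rows) / len(rows)
```

Both versions behave identically on valid input. They differ only in how they report a caller's mistake, which is why I count this kind of fix as low in significance.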

At this point in the static versus dynamic face off, I would summarize the score board as follows:

  • You can write industry-standard code in either style of language.
  • Static typing does not automatically yield non-buggy software. Netscape Navigator is a shining example in my mind. It's very buggy yet it's written in C++.
  • Static languages win, by quite a lot, for navigating code statically.
  • It's unclear which language gives the more productive debugging experience, but both are quite good with today's tools.
  • Testing appears to be adequate for finding the bulk of the significant errors that a type checker would find.
  • Static languages run faster.
  • Dynamic languages have consistently fast edit-run cycles; static languages at best tie with dynamic languages, and they are much worse if your development setup is off the beaten path.
  • Expressiveness does not align well with staticness. To name a few examples, C is more expressive than BASIC, Python is better than C, and Scala is better than Python.

Monday, July 2, 2012

Saving a file in a web application

I recently did an exploration of how files can be saved in a web application. My specific use case is to save a table of numbers to an Excel-friendly CSV file. The problem applies any time you want to save a file to the user's local machine, however.

There are several Stack Overflow entries on this question, for example Question 2897619. However, none of them have the information organized in a careful, readable way, and I spent more than a day scouting out the tradeoffs of the different available options. Here is what I found.

Data URLs and the download attribute

Data URLs are nowadays supported by every major browser. The first strategy I tried is to stuff the file's data into a data URL, put that URL as the href of an anchor tag, and set the download attribute on the anchor.
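As a rough illustration of this strategy, the sketch below builds the data URL and the anchor markup in Python, as one might when generating the page on the server. In a pure client-side application the same string would be assembled in JavaScript. The function names, the default filename, and the link label are all hypothetical.

```python
import csv
import io
from html import escape
from urllib.parse import quote


def csv_data_url(rows):
    """Serialize rows to CSV and pack the text into a data: URL."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    # Percent-encode the payload so commas and line breaks survive in a URL.
    return "data:text/csv;charset=utf-8," + quote(buf.getvalue())


def download_link(rows, filename="export.csv", label="Save as CSV"):
    """Return anchor markup whose href is the data URL and which sets download."""
    return '<a href="{}" download="{}">{}</a>'.format(
        escape(csv_data_url(rows), quote=True),
        escape(filename, quote=True),
        escape(label),
    )
```

For example, `download_link([[1, 2], [3, 4]])` returns an anchor whose href carries the whole table, so no server round trip is needed once the page has loaded.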

Unfortunately, multiple problems ensue. The worst of these is that Firefox simply doesn't support the download attribute; see Issue 676619 for a depressingly sluggish discussion of what strikes me as a simple feature to implement. Exacerbating the problem is Firefox Issue 475008. It would be tolerable to use a randomly generated filename if at least the extension were correct. However, Firefox always chooses .part at the time of this writing.

Overall, this technique is Chrome-specific at the time of writing.

File Writer API

The File Writer API is a carefully designed API put together under the W3C processes. It takes account of the browser security model [sic], for example by disallowing access to any file except those the user has approved through a native file picker dialog.

This API is too good to be true. Some web searching suggests that only Chrome supports or even intends to support it; not even Safari is marked as planning to support it, despite the API being implemented in WebKit and not in Chrome-specific code. I verified that the API is not present in whatever random version of Firefox is currently distributed with Ubuntu.

The one thing I will say in its favor is that if you are going to be Chrome-specific anyway, this is a clean way to do it.

ExecCommand

For completeness, let me mention that Internet Explorer also has an API that can be used to save files. You can use ExecCommand with SaveAs as an argument. I don't know much about this solution and did not explore it very far, because LogicBlox web applications have always, so far, needed to be able to run in non-Microsoft browsers.

For possible amusement, I found that this approach doesn't even reliably work on IE. According to a Stack Overflow post I found, on certain common versions of Windows, you can only use this approach if the file you are saving is a text file.

Flash

Often when you can't solve a problem with pure HTML and JavaScript, you can solve it with Flash. Saving files is no exception. Witness the Downloadify Flash application, which is apparently portable to all major web browsers. Essentially, you embed a small Flash application in an otherwise HTML+JavaScript page, and you use the Flash application to do the file save.

I experimented with Downloadify's approach with some custom ActionScript, and there is an unfortunate limitation to the whole approach: there is a widely implemented web browser security restriction that a file save can only be initiated in response to a click. That alone is not a problem in my context, but there's a compounding problem: web browsers do not effectively keep track of whether they are in a mouse-click context if you cross the JavaScript-Flash membrane.

Given these restrictions, the only way I see to make it work is for the button the user clicks to be a Flash button rather than an HTML button, which is what Downloadify does. That's fine for many applications, but it opens up severe styling issues. The normal way to embed a Flash object in a web page involves using a fixed pixel size for the width and height of the button; for that to work, the button's face has to be a PNG file rather than nicely formatted text using the user's preferred font. It seems like too high a price to pay for any team trying to write a clean HTML+JavaScript web application.

Use an echo server

The most portable solution I am aware of is to set up an echo server and use a form submission against that server. It is the only non-Flash solution I found for Firefox.

In more detail, the approach is to set up an HTML form, stuff the data to be saved into a hidden field of the form, and submit the form. Have your echo server respond with whatever data the client passed to it, and have it set the Content-Disposition HTTP header to indicate that the data should be saved to a file. Here is a typical HTTP header that can be used:

Content-Disposition: attachment; filename=export.csv

This technique is very portable; later versions of Netscape would probably be new enough. On the down side, it incurs significant latency to upload the content to the server and then download it back again.
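For illustration, here is a minimal sketch of such an echo server using Python's standard http.server module. It is not the server I actually deployed. It assumes the form is submitted by POST with the default URL-encoded body, and the hidden field name "data", the port number, and the function name serve are all hypothetical choices.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs


class EchoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the URL-encoded form body sent by the browser.
        length = int(self.headers.get("Content-Length", 0))
        form = parse_qs(self.rfile.read(length).decode("utf-8"))
        # "data" is a hypothetical name for the hidden form field.
        payload = form.get("data", [""])[0].encode("utf-8")

        self.send_response(200)
        self.send_header("Content-Type", "text/csv; charset=utf-8")
        # Ask the browser to save the response instead of displaying it.
        self.send_header("Content-Disposition", "attachment; filename=export.csv")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    # Port 8000 is an arbitrary choice for local experimentation.
    HTTPServer(("127.0.0.1", port), EchoHandler).serve_forever()
```

Calling `serve()` starts the server locally. The page then needs a form whose action points at it and whose hidden "data" field holds the CSV text; submitting the form triggers the browser's normal save dialog.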