The results of this experiment indicate that unit testing is not an adequate replacement for static typing for defect detection. While unit testing does catch many errors, it is difficult to construct unit tests that will detect the kinds of defects that would be programmatically detected by static typing. The application of static type checking to many programs written in dynamically typed programming languages would catch many defects that were not detected with unit testing, and would not require significant redesign of the programs.
I feel better about delivering code in a statically typed language if the code is more than a few thousand lines long. However, my feeling here is not due to the additional error checking you get in a statically typed language. Contra Farrer's analysis, I feel that this additional benefit is so small as to not be a major factor. For me, the advantages are in better code navigation and in locking developers down to using relatively boring solutions. Both of these lead to code that will stay more robust as it undergoes maintenance.
As such, the most interesting piece of evidence Farrer raises is that the four bodies of code he converted were straightforward to rewrite in Haskell. We can conclude, for these four small programs, that the dynamic features of Python were not important for expressiveness.
On the downside, Farrer's main conclusion is as much undermined by his evidence as supported. His main claim is that Haskell's type checker provides substantial additional error checking compared to what you get in Python. My objection is that all programs have bugs, and doing any sort of study of code is going to turn up some of them. The question is how significant those bugs are. On this criterion the bugs Farrer finds do not look very important.
The disconnect is that practicing programmers don't count bugs by number. The attribute they care about is the overall bugginess of the software. Overall bugginess can be quantified in different ways; one way to do it is to consider the amount of time lost by end users due to bugs in the software. Based on this metric, a bug that loses a day's work for the end user is supremely important, more important than any feature. On the other hand, a bug that merely causes a visual artifact, and not very often, would be highly unimportant.
The bugs Farrer reports mostly have to do with misuse of the software. The API is called in an inappropriate way, or an input file is provided that is bad. In other words, the "bugs" have to do with the software misbehaving if its preconditions are not met, and the "fix" is to update the software to throw an explicit error message rather than to progress some distance before yielding a walk back on a dynamic type error.
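To make that category concrete, here is a minimal Python sketch of the pattern. It is not taken from any of the programs Farrer studied; the function names and the input are hypothetical, chosen only to illustrate the difference between failing late on a dynamic type error and failing early with an explicit message.

```python
def total_unchecked(values):
    # Hypothetical "before" version: assumes every element is a number.
    # A bad element is only noticed partway through the loop, as a TypeError.
    total = 0
    for v in values:
        total = total + v
    return total


def total_checked(values):
    # Hypothetical "after" version: validate the precondition up front
    # and report a violation with an explicit message.
    for i, v in enumerate(values):
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise ValueError(f"item {i} is not a number: {v!r}")
    return sum(values)
```

Both versions give the same answer on well-formed input. They differ only when the caller violates the precondition, which is why I would hesitate to count the first version as seriously buggy.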
At this point in the static versus dynamic face-off, I would summarize the scoreboard as follows:
- You can write industry-standard code in either style of language.
- Static typing does not automatically yield non-buggy software. Netscape Navigator is a shining example in my mind. It's very buggy yet it's written in C++.
- Static languages win, by quite a lot, for navigating code statically.
- It's unclear which language gives the more productive debugging experience, but both are quite good with today's tools.
- Testing appears to be adequate for finding the bulk of the significant errors that a type checker would find.
- Static languages run faster.
- Dynamic languages have consistently fast edit-run cycles; static languages at best tie with dynamic languages, and they are much worse if your development setup is off the beaten path.
- Expressiveness does not align well with staticness. To name a few examples, C is more expressive than BASIC, Python is better than C, and Scala is better than Python.
For me the problem with dynamic languages is that it is much more difficult to understand a program that was written by another person. First, type annotations give you additional information about what to expect in a given method and where the method being invoked is located. Many times I need to debug a program in order to understand what is going on. I also agree that dynamically typed languages give you many more ways to express the same thing with less explicit information in the code. But to be honest, in statically typed languages with a complex inheritance chain I often struggle to understand which actual subclass was passed as a method argument. And when the type system becomes much more complex, it also becomes easier to debug the code in order to understand what is going on. This is really evident with Scala, especially with Scala implicits, where I get very weird type errors (this is actually worse, as you need to debug the compiler :-) which is not a very fun thing to do).
Lex, I think that the following comment may be a bit misleading:
"My objection is that all programs have bugs, and doing any sort of study of code is going to turn up some of them."
While this statement is definitely true, I want to emphasize that I only included the bugs that were detected by the Haskell type checker. By studying the code I found a lot more bugs than were listed in my paper, but because they were not caught by the type checker I didn't include them.
Concerning the importance of the bugs, I don't think I have a really good way to measure their importance. It's quite possible that some of them could cause a program to misbehave. I suspect if the bug is causing your website to be inoperable it's a critical bug, and if it's not then it's trivial :)
A fair point, Evan. Would that more people were so meticulously objective in gathering scientific data.
It's the analysis that appears to be so slippery.