
Sunday, June 4, 2023

Why the M4 macro syntax goes so wrong

I've been hacking in M4 recently, and I ran across a great article about M4 by Michael Breen. I highly recommend it for anyone either using M4 (e.g. for Autotools or for Bison), or considering M4 for a new project (probably a bad idea). This observation caught my eye:
While m4's textual rescanning approach is conceptually elegant, it can be confusing in practice and demands careful attention to layers of nested quotes.
Christman Brown writes something similar, in his review of M4:
It is difficult to debug. I quickly found my default behavior when encountering something unexpected is to just increase the escape quotes around something. That often fixed it.

What's going on here? A macro language is supposed to let you write normal text and then sprinkle some macro expansions here and there. It's supposed to save you from the tedium of dealing with a full-fledged general-purpose programming language. And sometimes this strategy works out: with the C preprocessor, you write ordinary C code most of the time, and it works totally fine to occasionally call a macro. Why does this approach work in C but not in M4?

I think Michael Breen is onto something. Macros are a form of subroutine, and with a well-designed syntax for subroutine calls, it should be straightforward and thoughtless to invoke a subroutine and pass it some arguments. You want the arguments to feel just like text in any other part of your system. Think how it feels to write HTML and to put a div around part of your page: you don't have to do any special encoding to the stuff you put inside the div, and you can, any time you want, take any chunk of your code with balanced tags and put that inside another pair of tags. With M4, the basic facility of a subroutine call, which you use all over the place, is somehow tricky to use.

M4 is not wrong to have a quoting mechanism, but where it goes wrong is to require quoting on the majority of subroutine calls. Here's what that looks like in practice. M4 uses function-call syntax to invoke macros, so a call looks like foo(a, b, c). That's a reasonable syntax to try, because function calls are a ubiquitous notation that users will recognize, but it has a problem for a text macro language: the comma is already a common character inside the a, b, and c arguments. Having observed that, M4 could and should have moved away from function-call notation and looked for something else. Instead, the designers stuck with function calls and augmented them with an additional kind of syntax, quotation. In practice, you usually quote all of your arguments when you pass them to an M4 macro, like this: foo([a], [b], [c]). Only usually, however, and you have to think about it every time. The quotation and the macro call are two different kinds of syntax, and the user has to control them individually.

The reason it works out better for C's preprocessor is that the C language already has a way to quote any commas that occur in the arguments, and the preprocessor understands those quoting mechanisms. For example, with sqrt(foo(x, y)), the preprocessor understands that the comma inside the (x, y) part does not separate the parameters of sqrt. Programmers already write those parentheses without thinking about it, because the function-call notation for foo(x, y) requires them anyway. Unfortunately, C does not do the right thing for an example like swap(a[i,j], a[i,j+1]), because it does not treat square brackets the way it treats parentheses. It could and it should, however. None of this maps over to M4 very well, because usually the arguments to an M4 macro are not code, so the author isn't going to naturally escape any commas that occur. The function-call syntax just doesn't work well for the situation M4 is intended for.

Fixing the local problem

If we wanted to write a next-generation M4, here are some observations to start from:

  • It is better if the quoting syntax is built into the subroutine call syntax. That way, users don't have to independently reason about both calls and quotes, and instead can just think about call-and-quote as a single thing that they do.
  • Look to markup languages for inspiration, for example XML or LaTeX. The general feel we are going for is that you mostly write text, and then occasionally you sprinkle in some macro calls. That's a markup language!

Based on these, a good thing to try for an M4-like system would be to use XML tags for macro invocation. XML is the culmination of a line of tag-based markup languages starting with SGML and HTML, and it is generally the state of the art for that kind of language. Among other advantages, XML is a small, minimal language that you can learn quickly, and it has explicit syntax for self-closing tags, rather than some tags being self-closing and others not in a context-dependent way that depends on the schema currently in effect for a given file.

LaTeX's macro syntax is also very interesting, and it has a big advantage in usually saying each tag name just once (\section{foo}) rather than twice (<section>foo</section>). However, my experience with LaTeX is that I am in constant doubt about how much lookahead text a macro invocation will really consume; the braces-based syntax is just a convention, and you never know for sure which macros honor that convention. That said, the general syntax looks like a promising idea to me if it were locked down a little more, rather than being based on the TeX macro language. A similar approach was used in Scribe, a markup language designed by Brian Reid in the 70s.

What to use for now

As things stand, I don't think M4 really has a sweet spot. Old projects that want to have an ongoing roadmap should probably move away from M4. New projects should never use it to begin with. What are the options right now, without having to build a new textual macro language?

It's not a bad option to use a general-purpose language like Python or Java. If you follow the links from the PP generic preprocessor that is used with Pandoc, they tell you they are replacing their templating with more and more usage of Lua, a general-purpose language. When you use a general-purpose language to generate text, you get the normal library routines your language already supports, plus mature lists, maps, structs, and iteration routines on top of them. An example of this direction is the jOOQ library for generating SQL code.
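
To make that concrete, here is a minimal sketch in Scala (the tool names and the check.sh script are made up for illustration): ordinary collections and string interpolation do the work that a macro processor does with argument splitting and quoting, and there is no quoting discipline to get wrong.

    // Sketch: generating build-file text from a general-purpose language.
    // The tool list and check.sh script are hypothetical.
    val tools = List("m4", "bison", "autoconf")
    val rules = tools
        .map(t => s"check_$t:\n\t./check.sh $t")
        .mkString("\n\n")
    println(rules)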

Another strong approach is to use XML, possibly augmented by XSLT. An example would be the query help format of the GitHub Code Scanner, a format that I designed many years ago at a startup called Semmle. We had an existing syntax based on HTML, with regex-based rewrites applied to the HTML file, and we had a problem that people were typo-ing the syntax without realizing it, resulting in help files that were sometimes unreadable and often missing a large portion of the text. I explored a few options for getting us a more rigorous format with tool support for common errors, and I landed on XML, which I feel worked out pretty well. In addition to the format itself being nice to work with, we got to tap into the existing XML ecosystem, for example to use Eclipse's excellent XML editor.

I briefly explored JSON as well, which is another minimal syntax that is easy to learn, but I quickly realized why they call XML a markup language. Unlike with JSON, XML lets you mostly write normal text, and then as a secondary thing, add special syntax--hence, "marking up" your text. XML is also a very mature system in general, so for example we could configure Eclipse (which was a viable tool back then!) to auto-complete tags and give you errors within the editor if you used tags that aren't allowed. If I were to rewrite how Bison's skeletons work, I think something based on XML would be tempting to try. Bison already uses this approach for its debug output. I'm not sure, though; XSLT looks pretty voluminous in practice.

Some of the best options are embedded in an existing general-purpose language. JSX is embedded in JavaScript, and Scribe is embedded in Scheme. I'm not sure how practical these are if you aren't already working in those environments, but if you are, look for one that works with your current source language.

The larger lesson

An effective tool has to be evaluated in the context it will be used in. Both C and M4 use function-call notation for macro invocation, but in C it works well, while in M4 it becomes a nightmare. Achieving an effective tool, therefore, requires design thinking. You need to learn about the situation you are targeting, you need to adjust your solution based on that, and above all you need to be ready to critique your solutions and iterate on improvements. The critique can take many forms, and a really important one is to watch what your users are actually doing, and to reflect on why it's happening and how it could be different.

Surprises can happen anywhere, and you'll support your users more if you can act on those surprises and try something different.

Tuesday, April 12, 2016

Two little things I wish Java would add

When geeking out about language design, it's tempting to focus on the things that require learning something new to even understand how it works. SAM types require understanding target typing, and type members require understanding path-dependent types. Fun stuff.

Aside from these things that are fun to talk about over beers, I really wish Java would pick up a few things from Scala that are just plain more convenient.

Multi-line string literals

A great way to structure a unit test is to feed in a chunk of text, run some processing that you want to verify, convert the actual output to text, and then compare it against another chunk of text that's included in the test case. Compared to a dense string of assertEquals calls, this testing pattern tends to be much easier to read and understand at a glance. When such a test fails, you can read a text diff at a glance, and possibly see several different kinds of failure at once, rather than stare into a thicket of assertEquals calls and try to deduce what is being tested by the particular one that failed.

The biggest weakness of this style is very mundane: it's hard to encode a multi-line chunk of text in Java. You have to choose between putting the text in an external file, or suffering through strings that have a lot of "\n" escapes in them. Both choices have problems, although the latter option could be mitigated with a little bit of IDE support.

In Scala, Python, and many other languages, you can write a multi-line string by opening it with triple quotes (""") rather than a single double-quote character ("). It's a trivial feature that adds a lot to the day-to-day convenience of using the language.

As one trick to be aware of, it's important to help people out with indentation when using triple quotes. In Scala, I lobbied for the stripMargin approach to dealing with indentation, where you put a pipe at the start of each continuation line, and anything up to the pipe is considered leading indentation and removed. In retrospect, I wish I had pushed for that to simply be the default behavior; if you need a literal pipe at the start of a line, you can always write it twice. Making people write stripMargin on almost every multi-line string is a form of boilerplate.
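
For reference, here is what that looks like in Scala today; everything up to and including the pipe on each line is stripped as indentation:

    val expected =
        """first line
          |second line
          |third line""".stripMargin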

Case classes

There are philosophers who disagree, but I find them a little too philosophical for my taste. Sometimes you really want to write a class that has no hidden internal state. Sometimes it would be a breach of the API to retain any internal state, or to implement the public API as anything other than plain old final fields. Some motivating examples are: tiny types, data structure nodes such as links in a linked list, and data-transfer objects.
In such a case, it takes a tremendous amount of code in Java to implement all the odds and ends you would really like for such a class. All of the following are desirable, and all of them are completely mechanical:
  • Constructors that copy their parameters to a series of final fields.
  • A toString() implementation.
  • Comparison operations: equals(), hashCode(), and compareTo(). Ideally also helpers such as isLessThan().
  • Copy constructors that make a new version by replacing just one of the fields with a new value.
The equals() method is particularly painful in Java, because the advice going around about how to write one is not consistent. I've been dragged into multi-day debates on equals() methods where people cite things I published in the past and try to use them against me; I'm pretty sure I meant what I said then and mean what I say now. Above all, though, I'd rather just have a reasonable equals() method and not spend time talking about it.
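
For comparison, a Scala case class gets nearly all of the items on that list from a single line; compareTo is the one exception, since an Ordering still has to be provided separately:

    case class Point(x: Int, y: Int)

    val p = Point(1, 2)
    p.toString            // "Point(1,2)"
    p == Point(1, 2)      // true: structural equals() and hashCode() for free
    val q = p.copy(y = 3) // a new Point(1,3), replacing just one field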

Thursday, November 5, 2015

The mystics are coming out

Discussion on the Scala collections revamp is starting to get mystical. It really bugs me: good language design makes a huge difference, but it's hard to see unless you actually deal with thousands or more developers on millions or more lines of code. Casual observers of language design discussions don't see it themselves, so they don't realize what these problems look like to the other people in the discussion. So they think everyone is just goofing around, throwing out random ideas because they are possible.

I started to follow up on the issue itself, but I fear making the problem worse. So I'll post a couple of rebuttals here. I don't really think the people actually involved in the redesign will be distracted by the mystics, anyway. They are like gold-sellers in MMOs or beggars in San Francisco; unless you are specifically trying to engage with them, you just learn to tune them out. Well, okay, they are like really nerdy gold sellers who like to talk about higher math. Okay I better just drop this attempt at an analogy.

First, Python's indexing operator has a lot of practical problems. Here are a few concrete examples: https://plus.google.com/+MattMight/posts/BVSmNadKni4 . Thinking backward from those puzzlers, I think the root of the problem is that the [a:b] slicing operator means something different depending on whether a and b are positive, and whether a is smaller or larger than b. This is foundational syntax used everywhere, and if you want code to be readable, people need to know whether it's doing a forward or reverse slice without having to do mental data flow analysis on the parameters. Java avoids this trap, and Scala should, too. The sublist operation should only work on non-negative arguments, and only when the first argument is smaller than the second.
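
To sketch what I mean (sublist here is a hypothetical helper for illustration, not an existing library method), the operation should refuse the ambiguous cases outright rather than silently picking a direction:

    // A hypothetical sublist that fails fast instead of guessing:
    def sublist[A](xs: Vector[A], from: Int, until: Int): Vector[A] = {
        require(0 <= from && from <= until && until <= xs.length,
            s"bad slice [$from, $until) of length ${xs.length}")
        xs.slice(from, until)
    }

    sublist(Vector(1, 2, 3, 4), 1, 3) // Vector(2, 3)
    sublist(Vector(1, 2, 3, 4), 3, 1) // throws, instead of quietly reversing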

The other thing I will say is that exceptions, null, and -1 are also all fine, when used by a tasteful library designer. At Semmle, we tried to get people to use more Options, and we found that it harmed our reputation as a quality lint tool. I can only speak publicly about open-source code, but to give an example, Apache Spark has over a thousand occurrences where they use null but, with only local changes, could use Option instead. That's too many; it means the basic premise of the advice has some sort of problem.

As one stab at it--though it's really a huge topic--you have to think about what you want the callers to do to defend against a missing value. If Scala's basic get/apply methods start returning an Option, then people will just litter their code with calls to .get, so the net result is that the code is more bloated but otherwise behaves exactly the same. Even in pure math, people write notation like f'(x) for the derivative of f, even though the derivative isn't always defined. Should smart mathematicians instead write get(f')(x)? Or (f') match { ... }?
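
To make the .get point concrete, here is the pattern sketched with an ordinary Scala Map: the Option-returning lookup is wordier, but once the caller forces the value, it fails in exactly the same place a throwing lookup would.

    val config = Map("height" -> 600)

    // Throwing lookup: crashes here if the key is missing.
    val h = config("height")

    // Option-returning lookup: most callers immediately write .get,
    // which crashes in the same place with more ceremony.
    val w = config.get("width").get // throws NoSuchElementException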

That's my try, but you don't even have to understand this if you are willing to ape mature libraries in the areas where they are working okay. It's not a practical problem in Java that the various .get() methods return null or throw exceptions; even if you say it's not perfect, it's certainly not bad. It is, however, a very big problem that Java collections are overly mutable. For example, see this Stack Overflow question: https://stackoverflow.com/questions/2842169/why-are-public-static-final-array-a-security-hole. Scala will be more attractive to more developers if it focuses on these widely recognized pain points. Which is why the mystics drive me crazy--if they get their way, they will find that their playground is gradually becoming a ghost town, but only after it's too late.

Tuesday, June 3, 2014

My analysis of the Swift language

Apple has put out Swift, which sounds like a nice language overall. Here is my flagrantly non-humble opinion about how its features line up with what I consider modern, well-established aspects of programming language design.

The good

First off, Swift includes range-checked integer arithmetic! Unless you explicitly ask for wraparound, any overflow will cause an exception. I was just commenting yesterday on what a tough problem this is for current programming languages.

It has function types, nested functions, and closures, and it has numerous forms of optimized syntax for closures. This is all heartening, and I hope it will stick going forward, much the way lexical variable binding has stuck. Closures are one of those features that are very helpful and have little downside once your language has garbage collection.

Swift's closures can assign to variables in an outer scope. That's the right way to do things, and I find it painful how much Java's designers struggle with this issue. As a technical detail, I am unclear what happens if a closure captures a variable but does not modify it. What ought to happen is that any read from it will see the latest value, not the value at the time the capture happened. However, the Closures section of the Language Guide suggests that the compiler will capture just the initial value in this case. I believe this is misguided and will cause as many traps as it fixes; for example, suppose the closure captures a counter that it reads but never increments--the programmer presumably still wants each read to see the counter's current value. The motto here should be: you don't know what the programmer meant, but you know what they wrote.
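
For comparison, here is the behavior I would expect, shown with a Scala closure: the closure captures the variable itself, not a snapshot of its value, so every read sees the latest write.

    var counter = 0
    val report = () => s"count = $counter" // captures the variable, not its value
    counter += 1
    println(report()) // prints "count = 1": the closure sees the later write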

Type inference is quite welcome. I don't know what more to say than that developers will take advantage of it all the time, especially for local variables.

Tuple types are a small touch that comes up in practical programming all the time. How many times have you wanted to return two values from a function, and had to design a class for it or otherwise pervert your design?
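
In Scala, for example, returning a pair takes no ceremony at all:

    def minMax(xs: Seq[Int]): (Int, Int) = (xs.min, xs.max)
    val (lo, hi) = minMax(Seq(3, 1, 4, 1, 5)) // lo = 1, hi = 5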

Enumerations seem good to include in the language. Language designers often seem to think that enums are already handled by other language features, and therefore should not be included. I respect that, but in this case it's a simple feature that programmers really like to use. Java's enums are baroque, and none of the several Datalog dialects I have worked on include enums at all. I miss having language support for a closed set of named integers. It's easy to support and will be extremely popular.

As an interesting trick, keyword arguments to functions are supported, but you have to opt in. That's probably a good combination. Keyword arguments are quite useful in cases where you have a lot of parameters, and sometimes that legitimately happens. However, it's unfortunate to afflict all functions with keyword arguments, because the keywords become part of the API. By making it opt in, the feature is there for the functions that can use it.

Including both structs and classes looks redundant at first, but it's quite helpful to have a value type that encompasses multiple other values. As an example, the boxed Integer type in Java would be much better as a struct than as a class.

Extensions look valuable for programming in the large. They let you make an existing class fit into a new framework, and they let you add convenience methods to an existing class. Scala uses its implicit conversions for extensions, but direct support for extensions also makes a lot of sense.

The way option chaining works is a nice improvement on Objective-C. In Objective-C, any access to nil returns nil. In practice, programmers are likely better off getting an error when they access nil, as a form of design by contract: when something goes wrong, you want the program to stop at that point, not some indefinite time later. Still, sometimes you want nil propagation, and when you do, Swift lets you just put a "?" after the access.

Weak references are helpful for any language with automatic memory management, but they look especially helpful in a language with reference-counting memory management. I don't follow why there are also "unowned" references, except that the designers didn't want your code to get polluted with ! dereferences. Even so, I would think this is a case of do or do not. If you are worried about ! pollution, which is a legitimate concern, then simply don't require the !.

As an aside, this is the reason I am not sure pervasive null is as bad as often claimed. In practical code, there are a lot of cases where a value is optional in general but, in a specific context, is known to be present. In such a case, you are just going to dereference it, and possibly suffer a null-pointer error if you were wrong. As such, programmers are guided into a style where they just insert dereferences until the compiler shuts up, which makes the code noisy without increasing practical reliability.

The dubious

Swift looks very practical and workable, but there are some issues I think could have been done better.

Single inheritance seems like a step backward. The linearization style of multiple inheritance has proven helpful in practice, and it eliminates the need for a separate "interface" or "protocol" feature. Perhaps the designers feel that C++'s multiple inheritance went badly and are avoiding multiple inheritance like the plague? I used to think that way, but it's been multiple decades since C++'s core design, and there are better designs for multiple inheritance nowadays.

Swift doesn't appear to include persistent data structures. This is the one feature I miss the most when I don't get to program in Scala, and I don't know why it isn't catching on more widely in other languages. Developers can add their own collection types, but since the new types aren't standard, you end up having to convert to standard types whenever you call into another library.

The automatic immutability of collections assigned to constants looks driven by the lack of persistent collections. It would be better to support the two features independently: let bindings be mutable or not, and let collections be mutable or not. All four combinations are very useful.
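
Scala keeps the two axes independent, and all four combinations show up in real code; a minimal illustration:

    import scala.collection.mutable

    val a = List(1, 2)           // fixed binding, immutable collection
    var b = List(1, 2)           // rebindable, immutable collection
    val c = mutable.Buffer(1, 2) // fixed binding, mutable collection
    var d = mutable.Buffer(1, 2) // rebindable, mutable collection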

Deinitialization, also known as finalization, looks like a throwback to me. In a system with automatic memory management, you don't want to know precisely when your memory is going to get freed. As such, you can't count on deinitializers running soon enough to be useful. Thus, you always need a fallback plan of deallocating things manually. Once you deallocate manually, though, deinitializers become just a debugging technique. It's better to debug leaks using a tool than with a language feature.

In-out parameters seem like a step backwards. The trouble is that most functions use only in parameters, so when you see a function call, a programmer's default assumption is that the callee will not modify the argument. It can lead to bad surprises if the parameter gets modified at all. Out parameters are so unusual that it's better to be explicit about them, for example by taking a mutable collection as an argument.

Custom precedence (and associativity) is likely to go badly. We discussed this in detail, over the course of days, for X10, because X10 is a scientific language that really benefits from a rich set of operators. One problem with user-defined precedence is that it's hard to scope: you want to scope the operators themselves, not their implementations, because parsing happens before method lookup. It's also tough on programmers if they have to learn a new precedence table for every file of code they read. All in all, we concluded that Scala had a reasonable set of trade offs here: have a built-in precedence table with a huge number of available operators, and make library designers simply choose from the existing operators.

I see no exceptions, which is likely to be a nuisance to programmers if they are truly missing. Sometimes you want to tear down a whole chunk of computation without exiting the whole program. In such cases, exceptions work very well. Maybe I just missed it.

Integer types are hard to get right, and I am not sure Swift has chosen a great approach. It's best to avoid unsigned types, and instead to have untyped operations that can apply to typed integers. It's also best to avoid having low-precision operations, even if you have low-precision storage. Given all of the above, you don't really need explicit conversions any more. Java's integer design is quite good, with the exception of the unnecessary char type that is not even good for holding characters. I suspect many people overlook this about Java, because it's a case where programmers are better off with a language with fewer features.

Sunday, June 16, 2013

When to best use type inference

Type inference can make code much better. It can save you from writing down something that is completely obvious, and thus a total waste of space to write down. For example, type inference is helpful in the following code:
    // Type inference
    val date = new Date

    // No type inference
    val date: Date = new Date
It's even better for generics, where the version without type inference is often absurd:
    // Type inference
    val lengths: List[Int] =
        names.map(n => n.length).filter(l => l >= 0)

    // No type inference
    val lengths: List[Int] =
        names.map[Int, List[Int]]((n: String) => n.length).
        filter((l: Int) => l >= 0)
When would a type not be "obvious"? Let me describe two scenarios.

First, there is obvious to the reader. If the reader cannot tell what a type is, then help them out and write it down. Good code is not an exercise in swapping puzzles with your coworkers.

    // Is it a string or a file name?
    val logFile = settings.logFile

    // Better
    val logFile: File = settings.logFile
Second, there is obvious to the writer. Consider the following example:
    val output =
        if (writable(settings.out))
            settings.out
        else
            "/dev/null"
To a reader, this code is obviously producing a string. How about to the writer? If you wrote this code, would you be sure that you wrote it correctly? I claim no. If you are honest, you aren't sure what settings.out is unless you go look it up. As such, you should write it this way, in which case you might discover an error in your code:
    val output: String =
        if (writable(settings.out))
            settings.out  // ERROR: expected String, got a File
        else
            "/dev/null"
Languages with subtyping all have this limitation. The compiler can tell you when an actual type fails to satisfy the requirements of an expected type. However, if you ask it whether two types can ever be used in the same context as each other, it will always say yes, they could be used as type Any. ML and Haskell programmers are cackling as they read this.

It's not just if expressions, either. Another place this issue crops up is in collection literals. Unless you tell the compiler what kind of collection you are trying to make, it will never fail to find a type for it. Consider this example:

    val path = List(
        "/etc/scpaths",
        "/usr/local/sc/etc/paths",
        settings.paths)
Are you sure that settings.paths is a string and not a file? Are you sure nobody will change that type in the future--and do you know what type errors they will get if they do? If you aren't sure, you should write down the type you are trying for:
    val path = List[String](
        "/etc/scpaths",
        "/usr/local/sc/etc/paths",
        settings.paths)  // ERROR: expected String, got a File
Type inference is a wonderful thing, but it shouldn't be used to create mysteries and puzzles. In code, just like in prose, strive to say the interesting and to elide the obvious.

Saturday, March 23, 2013

C compilers exploiting undefined behavior

It's getting out of hand the way C compilers exploit undefined behavior. I see via John Regehr's blog that there is a SPEC benchmark being turned into a noop via an undefined-behavior argument.

This isn't what the spec writers had in mind when they added undefined behavior. To fix it, Regehr's idea of having extra checkers to find such problems is a plausible one, though it will take a dedicated effort to get there.

An easier thing to do would be for gcc and Clang to stop the madness! If they see an undefined behavior bullet in their control-flow graphs, then they should leave it there, rather than assuming it won't happen and reasoning backward. This will cause some optimizations to stop working, but really, C compilers were already plenty good 10 years ago. The extra level of optimizations is not a net win for developers. Developers want speed, sure, but above all they want their programs to do what they look like they do.

It should also be possible to improve the spec around this, to pin down what undefined behavior means a little more specifically. For example, left-shifting into the sign bit of a signed integer is undefined behavior. That's way underspecified. The only real options are: shift into the sign bit as expected, turn the integer into unpredictable garbage, or throw an exception. As things stand, a C compiler is allowed to observe a bad left shift and then turn your whole program into a noop.

Tuesday, January 22, 2013

Virtual classes

Gilad Bracha has a great post up on virtual classes:
I wanted to share a nice example of class hierarchy inheritance....All we need then, is a slight change to ThreadSubject so it knows how to filter out the synthetic frames from the list of frames. One might be able to engineer this in a more conventional setting by subclassing ThreadSubject and relying on dependency injection to weave the new subclass into the existing framework - assuming we had the foresight and stamina to use a DI framework in the first place.

I looked into virtual classes in the past, as part of my work at Google to support web app developers. Bruce Johnson put out the call to support problems like the one Gilad describes above, and a lot of us thought hard on it. Just replace "ThreadSubject" with some bit of browser arcana such as "WorkerThread". You want it to work one way on App Engine and a different way on Internet Explorer, and you want to allow people to subclass your base class on each platform.

Nowadays I'd call the problem one of "product lines", having had the benefit of talking it over with Kurt Stirewalt. It turns out that software engineering and programming languages have something to do with each other. In the PL world, thinking about "product lines" leads you to coloring, my vote for one of the most overlooked ideas in PL design.

Here is my reply on Gilad's blog:

I'd strengthen your comment about type checking, Gilad: if you try to type check virtual classes, you end up wanting to make the virtual classes very restrictive, thus losing much of the benefit. Virtual classes and type checking are in considerable tension.

Also agreed about the overemphasis on type checking in PL research. Conceptual analysis matters, but it's hard to do, and it's even harder for a paper committee to review it.

I last looked into virtual classes as part of GWT and JS' (the efforts tended to go in tandem). Allow me to add to the motivation you provide. A real problem faced by Google engineers is to develop code bases that run on multiple platforms (web browsers, App engine, Windows machines) and share most of the code. The challenge is to figure out how to swap out the non-shared code on the appropriate platform. While you can use factories and interfaces in Java, it is conceptually cleaner if you can replace classes rather than subclass them. More prosaically, this comes up all the time in regression testing; how many times have we all written an interface and a factory just so that we could stub something out for unit testing?

I found type checking virtual classes to be problematic, despite having delved into a fair amount of prior work on the subject. From what I recall, you end up wanting to have *class override* as a distinct concept from *subclassing*, and for override to be much more restrictive. Unlike with subclassing, you can't refine the type signature of a method from the class being overridden. In fact, even *adding* a new method is tricky; you have to be very careful about method dispatch for it to work.

To see where the challenges come from, imagine class Node having both an override and a subclass. Let's call these classes Node, Node', and LocalizedNode, respectively. Think about what virtual classes mean: at run time, Node' should, in the right circumstances, completely replace class Node. That "replacement" includes replacing the base class of LocalizedNode!

That much is already unsettling. In OO type checking, you must verify that a subclass conforms to its superclass. How do you do this if you can't see the real superclass?

To complete the trap, imagine Node has a method "name" that returns a String. Node' overrides this and--against my rules--returns type AsciiString, because its names only have 7-bit characters in them. LocalizedNode, meanwhile, overrides the name method to look up names in a translation dictionary, so it's very much using Unicode strings. Now imagine calling "name" on a variable of static type Node'. Statically, you expect to get an AsciiString back. However, at run time, this variable might hold a LocalizedNode, in which case you'll get a String. Boom.

Given all this, if you want type checking, then virtual classes are in the research frontier. One reasonable response is to ditch type checking and write code the way you like. Another approach is to explore alternatives to virtual classes. One possible alternative is to look into "coloring", as in Colored FJ.

Sunday, October 21, 2012

Source Maps with Non-traditional Source Code

I recently explored using JavaScript source maps with a language very different from JavaScript. Source maps let developers debug in a web browser while still looking at original source code, even if that source code is not JavaScript. A lot of programming languages support them nowadays, including Dart, Haxe, and CoffeeScript.

In my case, I found it helpful to use "source" code that is different from what the human programmers typed into a text editor and fed to the compiler. This post explains why, and it gives a few tricks I learned along the way.

Why virtual source?

It might seem obvious that the source map should point back to the original source code. That's what the Closure Tools team designed it for, and for goodness' sake, it's called a source map. That's the approach I started with, but I ran into some difficulties that eventually led me to a different approach.

One difficulty is a technical one. When you place a breakpoint in Chrome on a file mapped via a source map, it places one and only one breakpoint in the emitted JavaScript code. That works fine for a JavaScript-to-JavaScript compiler, but I was compiling from Datalog. In Datalog, there are cases where the same line of source code is used in multiple places in the output code. For example, Datalog rules are run in two different modes: once during the initial bootstrapping of a database instance, and later during an Orwellian "truth maintenance" phase. With a conventional source map, it is only possible to breakpoint one of the instances, and the developer doesn't even know which one they are getting.

That problem could be fixed by changes to WebKit, but there is a larger problem: the behavior of the code is different in each of its variants. For example, the truth maintenance code for a Datalog rule has some variants that add facts and some that remove them. A programmer trying to make sense of a single-stepping session needs to know not just which rule they have stopped on, but which mode of evaluation that rule is currently being used in. There's nothing in the original source code that can indicate this difference; in the source code, there's just one rule.

As a final cherry on top of the excrement pie, there is a significant amount of code in a Datalog runtime that doesn't have any source representation at all. For example, data input and data output do not have an equivalent in source code, but they are reasonable places to want to place a breakpoint. For a source map pointing to original source code, I don't see a good way to handle such loose code.

A virtual source file solves all of the above problems. The way it works is as follows. The compiler emits a virtual source file in addition to the generated JavaScript code. The virtual source file is higher-level than the emitted JavaScript code, enough to be human readable. However, it is still low-level enough to be helpful for single-step debugging.

The source map links the two forms of output together. For each character of emitted JavaScript code, the source map maps it to a line in the virtual source file. Under normal execution, web browsers use the generated JavaScript file and ignore the virtual source file. If the browser drops into a debugger--via a breakpoint, for example--then it will show the developer the virtual source file rather than the generated JavaScript code. Thus, the developer has the illusion that the browser is directly running the code in the virtual source file.

Tips and tricks

Here are a few tips and tricks I ran into that were not obvious at first.

Put a pointer to the original source file for any code where such a pointer makes sense. That way, developers can easily go find the original source file if they want to know more context about where the code in question came from. Here's the kind of thing I've been using:

    /* browser.logic, line 28 */

Also, for the sake of your developers' sanity, each character of generated JavaScript code should map to some part of the source code. Any code you don't explicitly map will end up implicitly pointing to the previous line of virtual source that does have a map. If you can't think of anything to put in the virtual source file, then try a blank line. The developer will be able to breakpoint and single-step that blank line, which might initially seem weird. It's less weird, though, than giving the developer incorrect information.

Choose your JavaScript variable names carefully. I switched generated temporaries to start with "z$" instead of "t$" so that they sort to the bottom of the variables list in the Chrome debugger. That way, when an app developer looks at the list of variables in a debugger, the first things their eyes encounter are their own variables.

Emit variable names into the virtual source file, even when they seem redundant. It provides an extra cue for developers as they mentally map between what they see in the JavaScript stack trace and what they see in the virtual source file. For example, here is a line of virtual source code for inputting a pair of values to the "new_input" Datalog predicate; the "value0" and "value1" variables are the generated variable names for the pair of values in question.

    INPUT new_input(value0, value1)

Implementation approach

Implementing a virtual source file initially struck me as a cross-cutting concern that was likely to turn the compiler code into a complete mess. However, here is an approach that makes it not so bad.

The compiler already has an "output" stream threaded through all classes that do any code generation. The trick is to augment the class used to implement that stream with a couple of new methods:

  • emitVirtual(String): emit text to the virtual source file
  • startVirtualChunk(): mark the beginning of a new chunk of output

With this extended API, working with a virtual source file is straightforward and non-intrusive. Most compiler code remains unchanged; it just writes to the output stream as normal. Around each human-comprehensible chunk of output, there is a call to startVirtualChunk() followed by a few calls to emitVirtual(). For example, whenever the compiler is about to emit a Datalog rule, it first calls startVirtualChunk() and then pretty prints the code to the emitVirtual() stream. After that, it emits the output JavaScript.
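
In outline, the extended stream might look like the following Scala sketch. It is simplified: emit() and the (offset, line) mapping list are stand-ins of mine, and a real implementation would record precise character ranges rather than just chunk starts.

    class MappedOutput {
        private val js = new StringBuilder      // generated JavaScript
        private val virtual = new StringBuilder // virtual source file
        private var vLine = 0                   // current virtual source line
        // (JS offset, virtual line) pairs from which the source map is built.
        val mappings = scala.collection.mutable.ListBuffer[(Int, Int)]()

        def startVirtualChunk(): Unit =
            mappings += ((js.length, vLine)) // code emitted from here maps to this chunk

        def emitVirtual(text: String): Unit = {
            virtual ++= text + "\n"
            vLine += 1
        }

        def emit(code: String): Unit = js ++= code // unchanged call sites
    }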

With this approach, the extended output stream becomes a single point where the source map can be accumulated. Since this class intercepts writes to both the virtual file and the final generated JavaScript file, it is in a position to maintain a mapping between the two.

The main downside to this approach is that the generated file and the virtual source file must put everything in the same order. In my case, the compiler is emitting code in a reasonable order, so it isn't a big deal.

If your compiler rearranges its output in some wild and crazy order, then you might need to do something different. One approach that looks reasonable is to build a virtual AST while emitting the main source code, and then only convert the virtual AST to text once it is all accumulated. The startVirtualChunk() method would take a virtual AST node as an argument, thus allowing the extended output stream to associate each virtual AST node with one or more ranges of generated JavaScript code.

Monday, August 6, 2012

Deprecation as product lines

I would like to draw a connection between two lines of research: deprecation, and product lines. The punchline is that my personal view on deprecation could be explained by reference to product lines: deprecation is a product line with just two products. To see how that connection works, first take a look at what each of these terms means.

A product line is a collection of products built from a single shared pool of source code. Some examples of a product line would be:

  • The Android, iPhone, Windows, and Macintosh versions of an application.
  • The English, Chinese, and Lojban versions of an application.
  • The trial, normal, and professional versions of an application.
  • The embedded-Java and full-Java versions of a Java library.

There is a rich literature on product lines; an example I am familiar with is the work on CFJ (Colored Featherweight Java). CFJ is Java extended with "color" annotations. You "color" your classes, methods, and fields depending on which product line each part of the program belongs to. A static checker verifies that the colors are consistent with each other, e.g. that the mobile version of your code does not invoke a method that is only present on the desktop version. A build-time tool can build individual products in the product line by extracting just the code that goes with a designated color. To my knowledge, CFJ has not been explicitly used outside of the CIDE tool it was developed with, and CIDE itself does not appear to be widely used. Instead, the widely used tools for product lines don't have a good theoretical grounding.

Deprecation, meanwhile, is the annotation of code that is going away. As with CFJ, deprecation tools are very widely used but not well grounded theoretically. With deprecation, programmers mark chunks of code as deprecated, and a compile-time checker emits warnings whenever non-deprecated code accesses deprecated code. I have previously shown that the deprecation checker in Oracle javac has holes; there are cases where removing the deprecated code results in a program that either does not type check or does not behave the same.

As much as I enjoyed working on a specific theoretical framework for deprecation, I must now admit that it's really a special case of CFJ. For the simpler version of deprecation checking, choose two colors, non-deprecated and everything, and mark everything with the "everything" color. You then have two products in the product line: one where you leave everything as is, and one where you keep only the non-deprecated code.

There is a lot of potential future work in this area; for this post I just wanted to draw the connection. I believe CFJ would benefit from explicitly claiming that the colored subsets of the program have the same behavior as the full program; I believe it has this property, and I went to the trouble of proving it holds for deprecation checking. Also, I believe there is fruitful work in studying the kinds of colors that are available. With deprecation, there is usually no point in time where you can remove all deprecated code in the entire code base. You want to have a number of colors for the deprecated code, for example different colors for different future versions of the software.

Sunday, July 8, 2012

Evan Farrer Converts Code from Python to Haskell

Evan Farrer has an interesting post up where he converts some code from Python to Haskell. Kudos to Farrer for empirically studying a language design question. Here is his summary:
The results of this experiment indicate that unit testing is not an adequate replacement for static typing for defect detection. While unit testing does catch many errors it is difficult to construct unit tests that will detect the kinds of defects that would be programatically detected by static typing. The application of static type checking to many programs written in dynamically typed programming languages would catch many defects that were not detected with unit testing, and would not require significant redesign of the programs.

I feel better about delivering code in a statically typed language if the code is more than a few thousand lines long. However, my feeling here is not due to the additional error checking you get in a statically typed language. Contra Farrer's analysis, I feel that this additional benefit is so small as to not be a major factor. For me, the advantages are in better code navigation and in locking developers down to using relatively boring solutions. Both of these lead to code that will stay more robust as it undergoes maintenance.

As such, the most interesting piece of evidence Farrer raises is that the four bodies of code he converted were straightforward to rewrite in Haskell. We can conclude, for these four small programs, that the dynamic features of Python were not important for expressiveness.

On the down side, Farrer's main conclusion is as much undermined by his evidence as supported. His main claim is that Haskell's type checker provides substantial additional error checking compared to what you get in Python. My objection is that all programs have bugs, and doing any sort of study of code is going to turn up some of them. The question is in the significance of those bugs. On this criterion the bugs Farrer finds do not look very important.

The disconnect is that practicing programmers don't count bugs by number. The attribute they care about is the overall bugginess of the software. Overall bugginess can be quantified in different ways; one way to do it is to consider the amount of time lost by end users due to bugs in the software. Based on this metric, a bug that loses a day's work for the end user is supremely important, more important than any feature. On the other hand, a bug that merely causes a visual artifact, and not very often, would be highly unimportant.

The bugs Farrer reports mostly have to do with misuse of the software. The API is called in an inappropriate way, or an input file is provided that is bad. In other words, the "bugs" have to do with the software misbehaving when its preconditions are not met, and the "fix" is to make the software throw an explicit error message rather than progress some distance and then fall over with a dynamic type error.

At this point in the static versus dynamic face off, I would summarize the score board as follows:

  • You can write industry-standard code in either style of language.
  • Static typing does not automatically yield non-buggy software. Netscape Navigator is a shining example in my mind. It's very buggy yet it's written in C++.
  • Static languages win, by quite a lot, for navigating code statically.
  • It's unclear which language gives the more productive debugging experience, but both are quite good with today's tools.
  • Testing appears to be adequate for finding the bulk of the significant errors that a type checker would find.
  • Static languages run faster.
  • Dynamic languages have consistently fast edit-run cycles; static languages at best tie with dynamic languages, and they are much worse if your development setup is off the beaten path.
  • Expressiveness does not align well with staticness. To name a few examples, C is more expressive than BASIC, Python than C, and Scala than Python.

Wednesday, March 28, 2012

Shapiro on compiling away abstraction

Via Lambda the Ultimate, I see that Jonathan Shapiro has a rambling retrospective on BitC and why he thinks it has gotten into a dead end.

One of the several themes is that the following combination of design constraints causes trouble:
  • He wants good performance, comparable to C++.
  • He wants a better set of abstraction facilities than C++.
  • He wants separate compilation to do most of the work, like in C++, rather than have the runtime do most of the real compilation, as in Java.
It's hard to excerpt, but here's him explaining the way this all works in C++:
In C++, the "+" operator can be overloaded. But (1) the bindings for primitive types cannot be replaced, (2) we know, statically, what the bindings and representations *are* for the other types, and (3) we can control, by means of inlining, which of those operations entail a procedure call at run time. I'm not trying to suggest that we want to be forced to control that manually. The key point is that the compiler has enough visibility into the implementation of the operation that it is possible to inline the primitive operators (and many others) at static compile time.
To contrast, BitC has trouble due to its extra level of abstraction:
In BitC, *both* of these things *are* abstracted at static compile time. It isn't until link time that all of the representations are in hand.

He goes on to consider the implications of different points in the design space. One point he brings up is that there is another stage of compilation that can be helpful to exploit: install time. Instead of compile time, run time, or even the link time for an application, you can get a lot of leverage if you apply compilation techniques at the point that a collection of applications and libraries are installed onto a system.

Web toolkits are a different domain than the one Shapiro is thinking about, but they face this particular question as well. You can greatly improve web applications if the tools do some work before all the source code gets to the web browser in front of the user. Without tools, if you just hack JavaScript files by hand and post them on a static HTTP server, the web browser ends up lazily linking the program, which means the application takes longer to start up. Good toolkits do a lot of work before the code makes it down to the end user, and in particular they really go to town at link time. At link time, the entire program is available, so it's possible to divide the program content--both programmatic code and media resources--into reasonably sized bundles of downloadable content.

Wednesday, January 25, 2012

The good and bad of type checking, by example

I enjoyed watching Gilad Bracha present Dart to a group of Stanford professors and students. As one might expect, given Bracha's background, he spends considerable time on Dart's type system.

Several members of the audience seemed surprised to find a defender of Dart's approach to typing. They understand untyped languages such as JavaScript, and they understand strictly typed languages such as Java. However, they don't understand why someone would intentionally design a language where types, when present, might still fail to hold up at run time.

One blog post will never convince people one way or another on this question, but perhaps I can show the motivation and dissipate some of the stark mystification around Dart's approach. Let me provide two examples where type checking would complain about a program. Here's the first example:

String fileName = new File("output.txt");

I find examples like this very compelling. The programmer has made an easy mistake. There's no question it is a mistake; this code will always fail when it is run. Furthermore, a compiler can easily detect the mistake simply by assigning a type to each variable and expression and seeing if they line up. Examples like this make type checking look really good.

On the other hand, consider this example:

void drawWidgets(List<Widget> widgets) { ... }
List<LabelWidget> labels = computeLabels();
drawWidgets(labels);

This program is probably fine, but a traditional type checker is forced to reject it. Even though LabelWidget is a subclass of Widget, a List<LabelWidget> is not a subtype of List<Widget>, so the function call in the last line will not type check. The problem is that the compiler has no way of knowing that drawWidgets only reads from its input list. If drawWidgets were to add some more widgets to the list, then there would be a type error.

There are multiple ways to address this problem. In Java, programmers are expected to rewrite the type signature of drawWidgets as follows:

void drawWidgets(List<? extends Widget> widgets) { ... }
In Scala, the answer would be to use an alternate List type that is covariant in its argument.
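
Concretely, here is a minimal Scala rendering of the same code: the standard immutable List is declared List[+A], i.e. covariant, which is safe precisely because the list cannot be mutated, and the call type checks as written.

    class Widget
    class LabelWidget extends Widget

    def drawWidgets(widgets: List[Widget]): Unit =
        widgets.foreach(w => println(w)) // read-only use of the list

    val labels: List[LabelWidget] = List(new LabelWidget)
    drawWidgets(labels) // fine: List[LabelWidget] <: List[Widget]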

Whatever the solution, it is clear that this second example has a much different impact on developer productivity than the first one. First, in this second example the compiler is probably wrong, and it is just emitting an error to be on the safe side. Second, the corrected version of the code is much harder to understand than the original; in addition to parametric types, it uses a bounded existential type variable. Third, it raises the bar for who can use this programming language. People who could be perfectly productive in a simply typed language will have a terrible time with quantifier-happy generic Java code. For a host of reasons, I feel that on net the type checker makes things worse in this second example. The cases where it prevents a real error are outweighed by all the problems.

Dart's type system is unusual in that it is consistent with both examples. It rejects code like the first example, but is quiet for code like the second one.

Tuesday, November 8, 2011

Cloud9 is hitting limits with JavaScript

Cloud9 has a well-written call for adding classes to JavaScript:
Adding classes to JavaScript is technically not hard -- yet, its impact can be profound. It lowers the barrier to entry for new JavaScript developers and reduces the incompatibility between libraries. Classes in JavaScript do not betray JavaScript’s roots, but are a pragmatic solution for the developer to more clearly express his or her intent. And in the end, that’s what programming languages are all about.
Their argument about library compatibility seems strong to me. It is reasonable to write a Python or Java library that stands alone and has minimal external dependencies. With JavaScript, however, the temptation is strong to work within a framework like Dojo or JQuery just so that you get basic facilities like, well, classes. It's a good argument. If I were working on a large JavaScript code base, however, I'd be strongly tempted to switch over to Dart. It already has the basic language facilities they yearn for, and it's going to move forward much more quickly.

Monday, October 10, 2011

Dart spec is online

Google has posted a technical overview and a language specification for Dart, their new programming language for web browsers.

In short, the language is a skin around JavaScript. It provides syntax for parts of JavaScript that are left to convention, and it is designed to be easily compilable to JavaScript. It has optional types, class definitions, and a module syntax.

The type system has some controversial aspects, in particular an explicit choice not to bother about soundness. If I understand correctly, assigning an apple to a variable holding bananas would cause an error, but assigning an unknown fruit to a variable holding bananas would not. The idea is to pick up the egregious errors and otherwise leave the programmers alone.

Hat tip to Lambda the Ultimate, which has several interesting discussions in the comments.

The jlouis blog has a detailed breakdown of what's in the language. Delightfully, he includes the following homage to Blade Runner:

If all you know is Javascript, the language is probably pretty neat. But for a guy who has seen things you wouldn't believe. Haskell ships off the shoulder of MultiCore CPUs. Watched OCaml glitter in the dark near TAPL. All those moments will be lost in time, like tears in rain. Time to die.

Friday, August 19, 2011

Why LaTeX?

Lance Fortnow laments that no matter how crotchety he gets he can't seem to give up LaTeX:
LaTeX is a great system for mathematical documents...for the 1980s. But the computing world changed dramatically and LaTeX didn't keep up. Unlike Unix I can't give it up. I write papers with other computer scientists and mathematicians and since they use LaTeX so do I. LaTeX still has the best mathematics formulas but in almost every other aspect it lags behind modern document systems.

I think LaTeX is better than he gives it credit for. I also think it could use a reboot. It really is a system from the 80s, and it's... interesting how many systems from the 70s and 80s are still the best of breed, still in wide use, but still not really getting any new development.

Here's my hate list for LaTeX:
  • The grammar is idiosyncratic, poorly documented, and context-dependent. There's no need for any of that. There are really good techniques nowadays for giving a very extensible language a base grammar that is consistent in every file and supports self-documentation.
  • You can't draw outside the lines. For all the flexibility the system ought to have due to its macro system, I find the many layers of implementation to be practically impenetrable. Well written software can be picked up by anyone, explored, and modified. Not so with LaTeX--you have to do things exactly the way the implementers imagined, or you are in for great pain and terrible-looking output.
  • The error messages are often inscrutable. They may as well drop all the spew and just say, "your document started sucking somewhere around line 1308".
  • The documentation is terrible. The built-in documentation is hard to find and often stripped out anyway. The Internet is filled with cheesy "how to get started" guides that drop off right before they answer whatever question you have.
  • Installing fonts is a nightmare. There are standalone TrueType fonts nowadays. You should be able to drop in a font and configure LaTeX to use it. That this is not possible suggests that the maintainers are as afraid of the implementation as I am.
  • Files are non-portable and hard to extract. This problem is tied up in the implementation technology. Layered macros in a global namespace are not conducive to careful management of dependencies, so everything tends to depend on everything.
However, as bad as that list is, the pros make it worth it:
  • Excellent looking output, especially if you use any math. If you care enough to use something other than ASCII, I would think that the document appearance trumps just about any other concern.
  • Excellent collaborative editing. You can save files in version control and use file-level merge algorithms effectively. With most document systems, people end up mailing each other the drafts, which is just a miserable way to collaborate.
  • Scripting and macros. While you can't reasonably change LaTeX itself, what you can easily do is add extra features to the front end by way of scripts and macros; see the small sketch after this list.
  • It uses instant preview instead of WYSIWYG. WYSIWYG editors lead to quirky problems that are easy to miss in proofreading, such as headers being at the wrong level of emphasis. While I certainly want to see what the output will look like all the time, I don't want to edit that version. I want to edit the code. When you develop something you really want to be good, you want very tight control.
  • Scaling. Many document systems develop problems when a document is more than 10-20 pages long. LaTeX keeps on chugging for at least up to 1000-page documents.
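To make the macros point concrete, here is a tiny sketch of the kind of front-end extension I mean (the \keyterm name is made up for illustration):

  \newcommand{\keyterm}[1]{\textbf{#1}}
  % Usage: \keyterm{monad}. Every key term is styled consistently, and
  % changing the one definition restyles the whole document.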
I would love to see a LaTeX reboot. The most promising contender I know of is the Lout document formatting system, but it appears not to be actively maintained.

Saturday, July 30, 2011

Cedric on type erasure

I've been meaning to get around to posting on type erasure, and Cedric Beust beat me to it:
The main problem is that reified generics would be incompatible with the current collections.... The extra type information also impacts the interoperability between languages within the JVM but also outside of it.

I completely agree. I used to rail on erasure until I got more experience with it.

The interoperability issue is one big reason I now like erasure. With erased types, the interop layer uses only a very simple type system. Knowledge of complicated type systems stays within the compilers for individual languages.

An additional reason is that it puts the cost of type checking in the compiler rather than in the runtime. With erased types, the compiler works hard to do its type checking, and if it signs off, the code is known to be type safe. At runtime, the types disappear and the code runs at full speed.
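
To make that concrete, here is a minimal Scala sketch (my illustration, not Cedric's): after erasure, two lists with different element types are indistinguishable at runtime.

  val strings = List("a", "b")
  val ints    = List(1, 2)
  // The element types exist only in the compiler; at runtime both
  // values have the same class.
  println(strings.getClass == ints.getClass)  // prints: true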

This property is more than just pretty. It is very helpful to an engineer trying to build anything using the language. When you write code, you want to know how it is going to perform. With erasure, the things you write convert directly to machine code, just with extra details added such as which variable goes in which register. With reification, you end up with extra crud being inserted everywhere. To understand performance under reified types, you have to reason about this additional type crud. You'd rather not have to.

Thursday, June 9, 2011

Two kinds of type inference

There are two separate lines of work on type inference. While they are superficially similar, they face very different design constraints. Let me explain a few of those differences. To begin with, consider the following program. The two kinds of inference reach different conclusions about it.
  class Foo {
    var x: Object = "a string"
    var y: String = "initial value"
    def copy() {
      y = x    // type error, or no?
    }
  }

Should this example type check? There are two points of view.

One point of view is that the compiler is able to prove that x only ever holds strings. Therefore y only ever holds strings. Thus, there will never be a type error at run time, and so the code can be allowed to run as is. This point of view might be called information gathering. The tool analyzes the code, typically the whole program, and learns type information about that code.

Another point of view is that x holds objects and y holds strings, so the "y = x" line is a problem. Yes, the current version of x only ever holds strings. However, the "y = x" line is using x outside of its spec, and you don't want to use a variable outside of its spec even if you can temporarily get away with it. This point of view might be called slots and tabs. The slot y is not specced to hold the tab x. From this point of view, the indicated line is an error even though the program doesn't have any real problems.

Every user-facing type checker I am familiar with is based on the slots and tabs point of view. The idea is that programmers use types to provide extra structure to their programs. They don't want even nominal violations of the structure; they really want their types to line up the way they say they do. As a concrete example, imagine the author of Foo checks in their code, and someone else adds to the program the following statement: "foo.x = Integer.valueOf(12)". Now the nominal type error has become a real one, but the author of Foo has already gone home. It's better if the author of Foo finds out about the problem rather than someone else.

That's one example difference between the two kinds of type inference. A slots-and-tabs checker will flag errors that an information gatherer would optimize away. Here are three other design constraints that differ between the two.

Declarations are important for a type checker. For the type checker to know what the slots and tabs are specced as, it must have declared types. In the above example, if x and y did not have declared types on them, then the type checker for class Foo could not determine that there is a problem. To contrast, an information gatherer doesn't necessarily pay much attention to declarations. It can usually infer better information by itself, anyway.

Changing a published type checker breaks builds. For a language under development, once a type checker has been published, it takes great care to change it without breaking any existing builds. Consider the addition of generics to Java 1.5, where it took a great deal of energy and cleverness to make it backwards compatible with all the existing Java code in the world. To contrast, an information gathering type inference can be swapped around at whim. The only impact will be that programs optimize better or worse or faster or slower than before.

Type checkers must be simple. The type system of a slots-and-tabs type checker is part of the contract between the compiler and a human developer. Human beings have to understand these things, human beings that for the most part have something better to do with their time than study types. As a result, there is tremendous design pressure on a slots-and-tabs type checker to make the overall system simple to understand. To contrast, the sky is the limit for an information gatherer. The only people who need to understand it are the handful of people developing and maintaining it.

Overall, I wish there were some term in common use to distinguish between these two kinds of type inferencers. Alas, both kinds infer things, and the things they infer are types, so the shared terminology seems inevitable. The best one can do is strive to understand which kind of type inferencer one is working with. Developers of one or the other face different design constraints, and they will find different chunks of the published literature to be relevant.

Thursday, June 2, 2011

Martin Odersky on the state of Scala

Martin Odersky talked today at ScalaDays about the state of Scala. Here are the highlights for me.

Tools and Documentation

He doesn't think documentation is a big issue at this point. People used to complain to him about it all the time, and that has largely died out. I know that this demand for documentation was a large motivation for him to develop a book about Scala. Perhaps it helped.

He still thinks IDE support is an important problem. However, he claims the Eclipse support has gotten good enough for him to switch his compiler hacking from Emacs to Eclipse. That's high praise--Emacs users don't switch! I will have to try it again.

He emphasizes binary compatibility. In practice, libraries need to be recompiled when a new version of Scala comes out, because inevitably some trait or another has a new method in it. He has a number of ways to address this. He's long talked about tools to detect and fix problems by analyzing the bytecodes, and that work is going to be emphasized at TypeSafe. Additionally, new today is that he plans to designate stable releases of Scala that stay valid for a number of years and never have binary-incompatible changes.

He also pointed out that style checking tools would be helpful in larger development groups. It's a good point. Such tools don't take a lot of code, but I guess nobody has gotten interested enough in the problem to whip one up.

Functional programming

Martin went through an extended example based on a 2000 study comparing programming languages. In the study, students implemented a programming problem in one of seven different programming languages. The study is interesting on its own if you haven't read it before, and among other things shows that there is much more variability among programmers than among programming languages. However, we can learn something about programming languages by comparing either the best or the median solutions in each language.

Scala shines on the programming problem used in the study (converting phone numbers into word sequences), and it's because of Scala's excellent support for running functions across collections. Such facilities don't work well unless the language has a concise notation for functions. Here is the bottom line on several different solutions:

  • Students using compiled languages: 200-300 lines of code
  • Students using scripting languages: ~100 lines of code
  • Martin's solution in Scala: 20-30 lines of code
  • A Java master's solution in Java: over 100 lines of code

I will attribute the "Java master" if I can find a reliable source, but Martin showed the Java example and it looked pretty reasonable at a glance. The reason it is so long compared to the Scala solution is that instead of using collection operations, it defines several recursive methods that record their progress in extra parameters to the methods. I've written a lot of code like that in the last few years. I think about the beautiful functional solution, and then I start over with an imperative solution because inner classes in Java require ever so much boilerplate.
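
To give a flavor of the difference, here is a toy example of my own (not code from the talk): a small task written first with collection operations and then in the accumulator-passing recursive style that the long Java solutions tend to use.

  // Collection operations with a concise function notation: one line.
  def digitSum(s: String): Int = s.filter(_.isDigit).map(_.asDigit).sum

  // The same task with progress recorded in extra parameters.
  def digitSumRec(s: String, i: Int = 0, acc: Int = 0): Int =
    if (i >= s.length) acc
    else if (s(i).isDigit) digitSumRec(s, i + 1, acc + s(i).asDigit)
    else digitSumRec(s, i + 1, acc)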

Parallelism

Martin talked a bit on his latest thinking on making use of multiple cores, a problem that has obsessed programming-language research for several years now. One principle he emphasized is that people are much better at finding one solution that works than at finding all the potential problems that can occur due to non-determinism. Thus, he's interested lately in programming-language constructs that are parallel yet still deterministic. That's a tough principle to achieve in an expressive language! It rules out all of actors (!), agents, and software-transactional memory, because they all have state, and the state can change differently depending on the non-deterministic choices the implementation makes.

Aside from the general talk on the challenges of parallelism, Martin talked a bit about the parallel collections in Scala. They're better than I realized. Their implementation uses fork-join with work stealing, rather than blindly creating lots of threads. As an additional twist, they adaptively choose a chunk size based on how much work stealing appears to be happening. If there is no work stealing, then every node must be busy, so increase the chunk size.

To demonstrate how the collections can help, he made two tweaks to the phone-number solution to switch from sequential to parallel collections. After doing so, the program ran 2.5 times faster on a 4-core machine. One can imagine doing better than 2.5 times faster, but it's a very good start given how very easy the change was.
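
The tweak really is about that small. Here is a hypothetical sketch, not the actual phone-number code; note that .par was built into the Scala of this era, while modern Scala moves it to the separate scala-parallel-collections module.

  val nums = (1 to 1000000).toVector
  val sequential = nums.map(n => n * n)      // sequential collections
  val parallel   = nums.par.map(n => n * n)  // fork-join with work stealing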

Domain-specific languages

Martin emphasized that Scala is excellent for DSLs. However, he points out, and I agree, that embedded DSLs in Scala mesh so well that they are essentially just Scala code. I vastly prefer this style of DSL to the style where you embed the DSL but the constructs of the DSL don't map well to constructs in the host language. Since the code is doing something different from what it looks like it does, all kinds of weird bugs can arise. Whenever working on a DSL that doesn't embed in a straightforward way, I'd prefer to make it an external DSL with a plain old parser and interpreter.

Wednesday, May 11, 2011

Sven Efftinge on Java-killer languages

I just ran across Sven Efftinge's fun post on what he wants to see in a Java-killer language.

My list would be something like: remove boilerplate for common coding arrangements, make things easier to understand, be compatible with existing Java code, and otherwise leave everything alone.

Sven has a more detailed list. Here are his bullet points and some thoughts on them:

1. Don't make unimportant changes. Gosh yes. Changing = to :=, or changing the keywords, adds a barrier to entry for anyone learning the language. Don't do it without a real benefit.

2. Static typing. Static typing is one of those choices where the up-front choice is far from obvious and has many intangibles, but once you choose, many of the follow-up choices are fairly clear to people who know the area. I think it is perfectly reasonable to have untyped languages on the JVM, and I think it's perfectly reasonable to have simply typed languages with generics only used for collections. Note that the choice will strongly influence what sorts of applications the language is good for, however. Additionally, I would emphasize that today's type systems have gotten more convenient to use, so the niche for untyped languages is smaller than it used to be.

3. Don't touch generics. Java's type system is long in the tooth. While its basic parametric types are fine, there are parts that are simply bad: raw types, wildcards, arrays, and primitive types. If you are developing a Java killer, improving the type system is one of the ways you can improve the language. You'd be crazy not to consider it.

4. Use type inference. Absolutely. This is a large source of boilerplate in Java.
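
For example, compare spelling out a type in Java 1.5 with letting Scala infer it (my own illustration):

  // Java: Map<String, List<Integer>> index = new HashMap<String, List<Integer>>();
  // Scala: the type of index is inferred from the right-hand side.
  val index = scala.collection.mutable.Map.empty[String, List[Int]]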

5. Care about tool support (IDE). I agree. When I joined the Scala project in 2005, I was glad to see that the core team was working on a number of tools, including: scaladoc, the scala command (repl, script runner, and object runner), scalap, ant tasks, and the Eclipse plugin. Nowadays there are even more tools, including an excellent IntelliJ plugin and integration with a larger number of build tools.

In a nutshell, making programmers productive requires more than a good programming language. There are huge benefits to good tools and rich libraries. The overall productivity of a programmer is something like the product of language, tools, and libraries.

6. Closures. Yes, please. Historically, the main reason to leave closures out of a language was the lack of garbage collection, which hardly applies to Java. I don't understand why Java has been so slow to adopt them, and I was terribly saddened to hear Guy Steele at OOPSLA 1998 pronouncing that Java didn't look like it really needed closures. It was surreal given the content of the talk that he had given just minutes before.

7. Get rid of old unused concepts. Yes, in general. However, this can be hard to do while also maintaining compatibility and generally letting people write things in a Java way if they want. For the specific things Sven lists: totally agreed about fall-through switch; totally agreed about goto, but it's not in Java anyway; not so sure about bit operations. Bit operations are useful on the JVM, and besides, Java's numerics work reasonably already. Better to focus on areas where larger wins are possible.

Tuesday, March 29, 2011

When stateful code is better

One branch of functional-programming enthusiasts has long strived to eliminate state from programming. By doing so, you end up with program code that supports equational reasoning: if you know a=b in one part of the code, then you can freely replace a by b and b by a anywhere else in the code. Since there's no state, the program will still behave the same. It's good stuff.

Nonetheless, state is essential for good code in most languages. You don't want to live without it. Take a moment to consider the more practical of the functional programming languages, and see how programmers in those languages have voted with their feet. ML, the most popular typed, strict functional language, includes ref cells in the core language. Haskell, the most popular typed, lazy functional language, includes not just the state monad but also unsafePerformIO. Lisp and Scheme, the most popular untyped functional languages, shamelessly include state everywhere. It's a clean sweep. Functional programmers are using languages that have state.

Why is this? Let me describe a couple of programming problems that any practical language needs to be able to solve. Both of these problems are easy with state and hair-pulling without. Any language without state will give programmers a tough time with these problems, so such languages don't become popular. The two problems are: logging, and the model-view architecture.

With logging, what you'd like to do is write your program as normal and then insert log messages here and there in the code. As much as possible, you want to avoid disturbing the core logic of the program with the logging behavior. Solving this problem using state is so easy it's hard to even talk about. All you do is use that state, typically an external file system. Every time you want to log something, write the message to the state.
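
For example, a stateful logger can be as small as this minimal Scala sketch (the Log object and the doFirstThing/doSecondThing stubs are made up for illustration):

  import java.io.{FileWriter, PrintWriter}

  object Log {
    private val out = new PrintWriter(new FileWriter("app.log", true), true)
    def write(msg: String): Unit = out.println(msg)  // just write to the state
  }

  def doFirstThing(): Unit = ()
  def doSecondThing(): Unit = ()

  def doTwoThings(): Unit = {
    Log.write("about to do first thing")
    doFirstThing()
    Log.write("about to do second thing")
    doSecondThing()
  }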

What if you want to log without state? In that case, you have to pass around the log as a parameter to every function in the program that might want to log anything. Every function gets an extra parameter which is the state of the log before the function call. Such functions must also return that extra parameter, after updating, when they finish. This approach has two large problems. First, it requires pervasive changes throughout the code base to pass around the latest version of the log. Second, it's highly error prone. For example, the following function logs two messages but accidentally discards the first one:
def doTwoThings(log) = {
  val log1 = write(log, "about to do first thing")
  doFirstThing()
  val log2 = write(log, "about to do second thing")  // bug: builds on log, not log1
  doSecondThing()
  return log2
}
This kind of problem can probably be prevented with linear types, and many functional language researchers would observe this and go do a bunch of research on linear types. Until they come up with something, your best bet is to use state.

A second example is event handling in the model-view architecture that is so pervasive in practical code. In the model-view architecture, you write the program in two layers: one layer for the core model of the software and one layer for the view. Views have a pointer to their models, and whenever the model changes they update themselves. This way, the model code stands alone and can be analyzed and unit tested without needing a user interface. The view, meanwhile, focuses on user interfaces, and can be tested on its own if you stub out the model. It's a fine architecture, well worth its popularity. Here's the challenge for stateless programming: how do you update the model in response to an event?

In a stateful language, what programmers can do is mutate the model in place. Every pointer from the view to the model will still be valid, so the view doesn't need any changes to its structure. Again, it's so simple it is hard to even talk about.
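
In code, the stateful version looks something like this minimal sketch (the Model and View names are mine, not from any particular framework):

  class Model { var count = 0 }
  class View(model: Model) {
    def render(): String = "count = " + model.count  // re-reads the model each time
  }

  val model = new Model
  val view  = new View(model)
  model.count += 1        // mutate the model in place
  println(view.render())  // the view's reference is still valid: prints "count = 1"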

Now consider a stateless language. In a stateless language, you must not only update the model, but must also update any view object that refers to any part of the model that changed. Likewise, you have to update any view object that has a reference to any such view object, transitively. There's no theoretical bar to programming like this. However, your event-processing code ends up taking the view as an argument, just so that it can update all the pointers from the view to the updated model. This approach is tedious and error-prone in the same way as stateless logging. It's very easy to leave some parts of the view pointing to old parts of the model. If you do that, things will mostly work, but there will be stale data in parts of the view.

Stateless code is usually better. However, I can't escape believing that for logging and for the model-view architecture, it's the stateful version that is best. These problems share the aspect that there are references from one component (main application, view) to a separate component (log file, model), and the second component is undergoing change. By letting the reference be stateful, the two components can work with each other at arm's length. Contrary to its usual reputation, state in such cases is a help rather than a hindrance for building useful abstractions.