Wednesday, December 30, 2009

Run-time types

Guy Steele ruminated a few weeks ago about run-time types:

When Bill Joy and I were working on The Java Language Specification 14 years ago, we wrestled with the distinction between "types" and what many people were calling "run-time types". We found that the the term "run-time types" was causing some confusion, in part because some people seemed to think (a) that an object could have a compile-time type as well as a run-time type, and/or (b) a variable could have a run-time type as well as a compile-time type, and/or (c) that any type could serve as either a compile-time type or a run-time type. In the end, we tried to abolish the term "run-time type" in Java by abolishing it in the language specification.


This issue is coming up for the GWT compiler as we enrich the type hierarchy used within the compiler, only worse. Not only are there types in the source language that differ from the types available at run time, but there is a third type hierarchy used internally by the compiler. How do we make sense of all these type systems?

Steele and Joy have tried to eliminate the very notion of run-time types. However, as the post above indicates, they still haven't found a satisfactory replacement for it. One attempt is to use "class" instead of "run-time type", but how are we supposed to think about the run-time type of an array? Worse, what can GWT do, when it has not two but three type systems of interest? Steele wants a second word to replace "run-time type". GWT would need a third word.

With some trepidation, let me defend why I think run-time types are a useful concept. I have a feeling I am wandering into a quagmire, given the unease of others who have tried it, but so far I don't see exactly where the problem is. I'll give two motivations, and then I'll give some specific details on how I think of the type systems relating to each other.

The first reason to have run-time types is that it's the most obvious thing to call them. Java has a stark distinction between the types in the source language and the types you can test at run time. Thus, there are two systems of type-like things involved. At the same time, both systems are very type-like. They have a subtyping hierarchy, and they have a inclusion relation where some values are in the type and others are not. Thus, we are talking about type-like things that only exist at run time. It's simple English to call these things run-time types.

The second reason is that it works out well in formal language semantics. In Igarashi and Pierce's paper describing the erasure model for compiling Java, you start with a program using source-level types and "erase" parts of the program. For example, type List<String> becomes simply List; the String part is erased. After erasure, you still have a well-typed program. There's nothing special about it. The type system used for the new program is different than the type system for the old program, but it's still a type system in every way. Thus, run-time types appear to be well founded theoretically.

Based on motivations like these, I like to think of run-time types as a first-class concept. More generally, I think of all three of the type systems the GWT compiler sees as plain old type systems.

The trick, then, is how to think of the relations between these type systems. First of all, the GWT compiler largely ignores the Java source-level types. Those are modeled faithfully, and they are available during user-supplied code generation. However, as soon as all code has been loaded or generated, the types are erased, just like in a Java compiler that targets bytecode. Thus, the bulk of the compiler doesn't even see source-level types.

More interesting is the interpretation of the compiler's internal type system. I think of it as supplying extra information that the compiler knows to be true. For example, the internal type system has a "null type" that isn't available in either the source language's type system or the run-time type system. Whenever the compiler can prove that an expression will always evaluate to null, it can replace that expression's type with the null type. Whenever the compiler sees an expression of type null, it can use that for further optimization. For example, if it sees "foo == null ? a : b", and the type of "foo" is the null type, it can drop the "b" and replace it by "foo; a". It knows that "foo==null" will always be true.

The main mind-bender I have encountered with this approach has to do with the fact that compiler types can't be fully checked at run time. What is the meaning of a cast expression where the target type is something that won't be checked at run time? How should it be compiled? Should we feel compelled to add extra run-time checks corresponding to the enriched internal type system?

My current answer is to simply avoid the whole question. It's presently an invariant of the GWT compiler that every cast expression and every instanceof check tests against a run-time type. Problem solved. This approach is a cruder version of the "dcast" predicate used in Igarashi and Pierce. Looking forward, it would also be possible to add a second kind of cast operation, one that the compiler has proven will always succeed. Such a cast can cast to an arbitrary internal type, because it won't be emitted into the final output. So far, though, such a cast hasn't been strictly necessary.

Overall, run-time types look not only useful, but well founded and intuitive. They're types, and they're present at run time.

2 comments:

Robert "kebernet" Cooper said...

Forgive me, but is the "Runtime type" even a real thing in GWT?

In Gwittir, my introspection system, we use a big two dimensional sort to eliminate multiple inheritance (via interface) based on instanceof evals, because they are compile time. GWT have never supported "real" runtime types as the Java world would consider them.

This isn't even about reified types, it is about raw types against interfaces, which GWT doesn't even (really) support right now.

The big thing is, it seems to me your whole argument is about compile time boundaries that only affect GWT-RPC, and the whole world has gotten around that already. If you are saying GWT is going to implement java.lang.Class in a more meaningful way, that will make my life easier, but I think it would be detrimental to the executable as a whole.

Lex Spoon said...

Yes, GWT has real run-time types, the same as Java. You can do type tests at run time by using instanceof and casts, but such tests are only accurate up to the limitations of the types actually present at run time.

Java calls its run-time types "classes", but I find that awkward. It strikes me as a contortion to say that "array of int" is a class.