Friday, December 5, 2014

Goodbye, Spark Plug

Don't get me wrong. I know a dog is just a dog, and a pet is just a pet. There are people reading this who have cancer, and there are some who have outlived their human children. On the scale of life challenges, I've had maybe a 3/10.

Still, I would like to write a few words. It's a way to organize my thoughts, and a way to say goodbye. I promise the next post will be about programming or law or identity or the web, but that all seems rather dry to me today.

As all you pet owners know, you get a little Pavlovian jolt each time you help out your little ward and they reward you for it. For example, when they greet you at the door and run in circles. Or when they learn your gait well enough to jump against each leg in rhythm, bouncing to the other one just before you pick that foot up. When they're cold, and you blanket them up, and they let out a deep sigh of contentment. When there's a burr in their foot, and they plaintively hold it out so that someone with thumbs can do something about it.

Over time it really adds up. You become attuned to where you expect them to be when you walk into a room. You gain a sixth sense about which ruffles in a couch blanket have a dog under them. You expect that if you plink through a few measures of a waltz, you'll see two brown eyes peek around the corner to see what you're doing. After 18 years of that and then nothing, you are left with a lot of little voids that add up to one great big void.

Some animals go and hide when they become deathly sick, but this one did not. In his last hours he came to me to fix it. Dog or no, it was crushing to see such hope and confusion, yet so little ability to do anything about it.

To anyone out there facing this kind of question, let me tell you that I feel no unease at all about the decision to eschew blood samples, IV fluids, antibiotics, and whatever else I didn't even ask about, to try to buy him a little more time. I keep thinking: he was 18, kidneys don't get better, and he had multiple other problems anyway. What I really wish I could go back and change is that I delayed euthanasia too long. I even had my mind made up, and I went to a 24-hour vet to do it, but I backed down when confronted with a thorough vet who wanted to rehash the dog's entire medical history. I thought I could just take him to our regular vet the next day, but the sun never rose for him again. Yes, hindsight is 20/20, but I wish I had insisted.

Goodbye, Spark Plug. I hope we did well for you.


P.S. -- Mom, you are very optimistic to think we can get this plant to bloom every December. We'll give it a try!

Sunday, November 23, 2014

Is this the right server?

It's nice to see someone else reach the following conclusion:

"For those familiar with SSH, you should realize that public key pinning is nearly identical to SSH's StrictHostKeyChecking option. SSH had it right the entire time, and the rest of the world is beginning to realize the virtues of directly identifying a host or service by its public key."

Verifying a TLS certificate via the CA hierarchy is better than nothing, but it's not really all that reassuring. Approximately, what it tells you is that there is a chain of certification leading back to one or more root authorities, which, for some reason we all try not to think about too much, are granted ultimate authority over the legitimacy of web sites. I say "approximately" because fancier TLS verifiers can and do incorporate additional information.

The root authorities are too numerous to really have faith in, and they have been compromised in the past. In general, they and their delegates have little incentive to be careful about what they are certifying, because the entities they certify are also their source of income.

You can get better reliability in key verification if you use information that is based on the interactions of the actual participants, rather than on any form of third-party security databases. Let me describe three examples of that.


Pin the key

In many cases, a remotely installed application needs to communicate with only a handful of servers back at a central site you control. In such a case, it works well to pin the public keys of those servers.

The page quoted above advocates embedding the public key directly in the application. This is an extremely reliable way of obtaining the correct key. You can embed the key in the app's binary as part of your build system, and then ship the whole bundle over the web, through an app store, or however else you are transmitting it to the platform it will run on. Given such a high level of reliability, there is little benefit from pulling in the CA hierarchy.

As linked above, you can implement pinning today. It appears to be tricky manual work, though, rather than something that is built into the framework. As well, you don't get to ignore the CA hierarchy by doing this sort of thing. So long as you use standard SSL libraries, you still have to make sure that your key validates in the standard ways required by SSL.
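
To make the idea concrete, here is a minimal sketch in Swift, assuming Apple's CryptoKit for the hashing. The pin value below is a placeholder, and a real client would extract the server's key from the TLS handshake:

    import Foundation
    import CryptoKit

    // Hypothetical pin: the SHA-256 hash of the server's DER-encoded public
    // key, baked into the binary by the build system. The value below is
    // only a placeholder.
    let pinnedKeyHash = Data(base64Encoded:
        "r/mIkG3eEpVdm+u/ko/cwxzOMo1bk4TyHIlByibiA5E=")!

    // Accept a connection only if the key the server presented hashes to
    // the embedded pin.
    func keyMatchesPin(serverPublicKeyDER: Data) -> Bool {
        return Data(SHA256.hash(data: serverPublicKeyDER)) == pinnedKeyHash
    }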


Associate keys with links

The Y property deserves wider recognition, given how important hyperlinks are in today's world. Put simply, if someone gives you a hyperlink, and you follow that hyperlink, you want to reliably arrive at the same destination that the sender wanted you to get to. That is not what today's URLs give you.

The key to achieving this property is that whenever you transmit a URL, you also transmit a hash of the expected host key. There are many ways to do this, including the ones described at the above hyperlink (assuming you see the same site I am looking at as I write this!). As one very simple possibility, URLs could take the following form:

     https://hash-ABC123.foo.bar/sub/dir/foo.html

This particular example is interesting for being backward compatible with software that doesn't know what the hash means.
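
A verifier for this scheme only needs to peel the hash out of the host name and compare it against the key the server actually presents. Here is a rough sketch in Swift, again assuming CryptoKit; the "hash-" label and truncated-hex encoding are my own illustration of the example URL above, not an established format:

    import Foundation
    import CryptoKit

    // Extract the expected key hash from a host like "hash-ABC123.foo.bar".
    func expectedHash(inHost host: String) -> String? {
        guard let label = host.split(separator: ".").first,
              label.hasPrefix("hash-") else { return nil }
        return String(label.dropFirst("hash-".count))
    }

    // Compare the server's actual key against the hash carried in the URL.
    // The example URL carries a truncated hash, so compare by prefix.
    func hostKeyIsValid(host: String, serverPublicKeyDER: Data) -> Bool {
        guard let expected = expectedHash(inHost: host) else { return false }
        let actual = SHA256.hash(data: serverPublicKeyDER)
            .map { String(format: "%02X", $0) }
            .joined()
        return actual.hasPrefix(expected)
    }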

I don't fully know why this problem is left languishing. Part of it is probably that people are resting too easy on the bad assumption that the CA hierarchy has us covered. There's a funny mental bias where, if we know nothing about a subject and we see smart people working on it, the more optimistic of us just assume that it works well. Another part of the answer is that the core protocols of the world-wide web are implemented in many disparate code bases; SSH benefited from having an authoritative implementation of both the client and the server, especially in its early days.

As things stand, you can implement "YURLs" for your own software, but they won't work as desired in standard web browsers. Even with custom software, they will only work among organizations that use the same YURL scheme. This approach looks workable to me, but it requires growing the protocols and adopting them in the major browsers.


Repeat visits

One last source of useful information is the user's own previous interactions with a given site. Whenever you visit a site, it's worth caching the key for future reference. If you visit the "same" site again but the key has changed, then you should be extremely suspicious. Either the previous site was wrong, or the new one is. You don't know which one is which, but you know something is wrong.
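
In code, the check is just a cache lookup. A toy sketch in Swift, again assuming CryptoKit (in-memory only; a real browser would persist the store):

    import Foundation
    import CryptoKit

    // Trust on first use: remember the key hash seen for each host, and
    // flag any later change.
    struct KeyMemory {
        private var seen: [String: Data] = [:]

        enum Verdict { case firstVisit, sameKeyAsBefore, keyChanged }

        mutating func check(host: String, serverPublicKeyDER: Data) -> Verdict {
            let hash = Data(SHA256.hash(data: serverPublicKeyDER))
            guard let previous = seen[host] else {
                seen[host] = hash
                return .firstVisit   // a site you've never seen before
            }
            return previous == hash ? .sameKeyAsBefore : .keyChanged
        }
    }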

Think how nice it would be if you tried to log into your bank account and the browser said, "This is a site you've never seen before. Proceed?"

You can get that already if you use pet names, which have been implemented as an experimental browser extension. It would be great if web browsers incorporated functionality like this, for example turning the URL bar and browser frame yellow when a site's certificate is one they have never seen before. Each browser can add this sort of functionality independently, as a quality-of-implementation issue.

In your own software, you can implement key memory using the same techniques as for key pinning, as described above.


Key rotation

Any real cryptography system needs to deal with key revocation and with upgrading to new keys. I have intentionally left these topics out to keep the discussion simple, but I do believe they can be worked into the above systems. It's important to have a way to sign an official certificate upgrade, so that browsers can correlate new certificates with old ones during a graceful phase-in period. It's also important to have some kind of channel for revoking a certificate, in case one is compromised.
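
As a sketch of the first mechanism, here is what an upgrade statement might look like in Swift with CryptoKit's Curve25519 signatures. The structure is my own illustration, not an established format:

    import Foundation
    import CryptoKit

    // An upgrade statement: the old key endorses the new one, so a client
    // that trusts the old key can transfer that trust during the phase-in.
    struct KeyUpgrade {
        let newPublicKey: Data   // raw bytes of the new public key
        let signature: Data      // made by the *old* private key over those bytes
    }

    func makeUpgrade(oldKey: Curve25519.Signing.PrivateKey,
                     newKey: Curve25519.Signing.PrivateKey) throws -> KeyUpgrade {
        let newPub = newKey.publicKey.rawRepresentation
        return KeyUpgrade(newPublicKey: newPub,
                          signature: try oldKey.signature(for: newPub))
    }

    func verifyUpgrade(_ upgrade: KeyUpgrade,
                       trustedOldKey: Curve25519.Signing.PublicKey) -> Bool {
        return trustedOldKey.isValidSignature(upgrade.signature,
                                              for: upgrade.newPublicKey)
    }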

For web applications and for mobile phone applications, you can implement key rotation by forcing the application to upgrade itself. Include the new keys in the newly upgraded version.

Thursday, November 20, 2014

FCC inches away from neutrality

The FCC’s latest proposal for network neutrality rules creates space for broadband carriers to offer “paid prioritization” services.[11] While the sale of such prioritization has been characterized as a stark and simple sorting into “fast” and “slow” traffic lanes,[12] the offering is somewhat more subtle: a paid prioritization service allows broadband carriers to charge content providers for priority when allocating the network’s shared resources, including the potentially scarce bandwidth over the last-mile connection between the Internet and an individual broadband subscriber. Such allocation has historically been determined by detached—or “neutral”—algorithms. The Commission’s newly proposed rules, however, would allow carriers to subject this allocation to a content provider’s ability and willingness to pay.

That's from a note in the Stanford Law Review a few months ago. I think this evolution in the FCC's approach will benefit the public.

It seems important to consider realistic developments of the Internet. Here's a thought experiment I've used for a long time, and one that seems to be happening in practice. Try to imagine what goes wrong if a site like YouTube or Netflix pays, with its own money, to install some extra network infrastructure in your neighborhood, but only allows its own packets to go across that infrastructure. Doing so is a flagrant violation of network neutrality, because packets from one site will get to you faster than packets from another site. Yet I can't see the harm. It seems like a helpful development, and just the sort of thing that might get squashed by an overly idealistic commitment to neutrality.

As a follow-on question, what changes if instead of Netflix building the infrastructure itself, it pays Comcast to do it? It's the same from a consumer's view as before, only now the companies in question are probably saving money. Thus, it's even better for the general public, yet it's an even more flagrant violation of network neutrality. In this scenario, Netflix is straight-up paying for better access.

It seems that the FCC now agrees with that general reasoning. They not only support content delivery networks in general, but now they are going to allow generic ISPs to provide their own prioritized access to sites that pay a higher price for it.

I believe "neutrality" is not the best precise goal to go for. Rather, it's better to think about a more general notion of anti-trust.

Tuesday, October 14, 2014

Three tiers of classrooms

Via Arnold Kling, I see Jesse Rothstein trying to prove that you can't measure teaching ability, or perhaps even that teaching ability doesn't matter:
Like all quasi-experiments, this one relies on an assumption that the treatment – here, teacher switching – is as good as random. I find that it is not: Teacher switching is correlated with changes in students’ prior-year scores.

It's important to figure out which kind of classroom we are talking about. There are at least three tiers of classroom styles. If you measure only in the middle tier, then I can believe that teacher skill would have only a small effect. However, it's really easy to tell the difference between the tiers if you look, especially for the bottom-most tier compared to the other ones.

At the bottom tier, some classes are just zoos. The teacher is ignored, and the students talk to each other. At best, they work on material for another class. Teacher skill doesn't matter within this tier, from an academic perspective; one zoo teaches students just as much as another zoo. I am sad to say that classrooms like this do exist. It's a potential bright note that such teachers are very easy to identify in an objective way: their students have absolutely terrible results on standardized tests such as Advanced Placement (AP). There's no need for sophisticated statistics if all the students are scoring 1-2 out of 5.

At the middle tier, some classes involve the teacher walking the students through standardized textbooks and other material. Basically, the textbooks are software and the teachers are the hardware that runs it. It's not an inspiring kind of classroom, but at least it is inexpensive. Within this tier, I could see teacher skill not mattering much, because the students spend all their time glued to the course materials. However, you'd certainly like to find out who is in this tier versus in the zoo tier.

Worth a brief mention is that there's an upper tier as well. Maybe "style" is a better word in this case. Sometimes the teacher actually understands the course material, and so is able to respond to questions with anecdotes and exercises tailored to the particular student. For this tier, teacher evaluation is especially important. Among other things, some teachers are fooling themselves, and would be better off staying closer to the book.

Friday, June 27, 2014

Edoardo Vacchi on attribute grammars

I previously wrote that predictable performance is a practical challenge for using attribute grammars in real work. It does little good to quickly write the first version of a compiler pass if you then spend hours debugging oddball performance problems.

Edoardo Vacchi wrote me the following in response. I agree with him: having an explicit evaluation construct, rather than triggering attribute contributions automatically, is likely to make performance more predictable. UPDATED: edited the first paragraph as suggested by Edoardo.

Hi,

This is Edoardo Vacchi from Università degli Studi di Milano (Italy). For my PhD thesis I’m working on a language development framework called “Neverlang”[1,2]. Neverlang is an ongoing project of Walter Cazzola's ADAPT Lab; I am involved with its latest incarnation "Neverlang 2".

I stumbled upon an (old) blog post of yours about Attribute Grammars [3] and I would be interested to know if you knew some “authoritative” references that I could cite with respect to the points that you raise, with particular attention to point (3) “unpredictable performances” and, in part, to (2) caching.

The Neverlang model resembles that of simple “compiler-compilers” like Yacc, where attributes behave more like variables than functions; thus they are generally computed only once; in Neverlang attributes can also be re-computed using the `eval` construct, which descends into a child and re-evaluates the corresponding semantic action.

On the one hand, the need for an explicit `eval` makes it less "convenient" than regular AG-based frameworks; on the other hand, I believe this gives better predictability, and, although the focus of the framework is not performance but rather modularity, I think that "predictability" would better motivate the reasons for this choice.

Thanks in advance,

[1] http://link.springer.com/chapter/10.1007%2F978-3-642-39614-4_2#page-1
[2] http://dl.acm.org/citation.cfm?id=2584478
[3] http://blog.lexspoon.org/2011/04/practical-challenges-for-attribute.html

Edoardo Vacchi is a PhD student at Walter Cazzola's ADAPT-Lab, a research lab at Università degli Studi di Milano that investigates methods and techniques for programming language development and software adaptation and evolution. Walter Cazzola is an associate professor at UniMi, and his research is concerned with software and language engineering. More info about Neverlang can be found at the website http://neverlang.di.unimi.it.

Tuesday, June 3, 2014

My analysis of the Swift language

Apple has put out Swift, which sounds like a nice language overall. Here is my flagrantly non-humble opinion about how its features line up with what I consider modern, well-established aspects of programming language design.

The good

First off, Swift includes range-checked integer arithmetic! Unless you explicitly ask for wraparound, any overflow will cause an exception. I was just commenting yesterday on what a tough problem this is for current programming languages.
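
For example, the `&`-prefixed operators are Swift's explicit opt-in to wraparound:

    let x: Int8 = Int8.max       // 127
    let wrapped = x &+ 1         // -128: wraparound, by explicit request
    // Plain "x + 1" would overflow; Swift stops the program rather than
    // silently wrapping (and the compiler rejects it outright when it can
    // see the overflow statically).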

It has function types, nested functions, and closures, and it has numerous forms of optimized syntax for closures. This is all heartening, and I hope it will stick going forward, much the way lexical variable binding has stuck. Closures are one of those features that are very helpful and have little downside once your language has garbage collection.

Swift's closures can assign to variables in an outer scope. That's the right way to do things, and I find it painful how much Java's designers struggle with this issue. As a technical detail, I am unclear on what happens if a closure captures a variable but does not modify it. What ought to happen is that any read from it will see the latest value, not the value at the time the capture happened. However, the Closures section of the Language Guide suggests that the compiler will capture just the initial value in this case. I believe this is misguided and will cause as many traps as it fixes; for example, suppose the closure captures a counter but never increments that counter itself? It should still see increments made by the surrounding code. The motto here should be: you don't know what the programmer meant, but you know what they wrote.
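
Here is the kind of case I mean; under capture-by-reference, this prints the up-to-date count:

    // A closure that reads a counter but never writes it. It should still
    // observe increments made by the surrounding code.
    var counter = 0
    let report = { print("counter is \(counter)") }
    counter += 1
    counter += 1
    report()   // ought to print "counter is 2", not the value at capture time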

Type inference is quite welcome. I don't know what more to say than that developers will take advantage of it all the time, especially for local variables.

Tuple types are a small touch that comes up in practical programming all the time. How many times have you wanted to return two values from a function, and had to design a class for it or otherwise pervert your design?
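
For example:

    // Two results from one function, with no throwaway class in sight:
    func divide(_ a: Int, by b: Int) -> (quotient: Int, remainder: Int) {
        return (a / b, a % b)
    }
    let (q, r) = divide(17, by: 5)   // q == 3, r == 2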

Enumerations seem good to include in the language. Language designers often seem to think that enums are already handled by other language features, and therefore should not be included. I respect that, but in this case, it's a simple feature that programmers really like to use. Java's enums are baroque, and none of the several Datalog dialects I have worked on include enums at all. I miss having language support for a closed set of named integers. It's easy to support and will be extremely popular.
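
A small example of what that support buys you; the switch is checked for exhaustiveness:

    enum Suit { case hearts, diamonds, clubs, spades }

    func color(of suit: Suit) -> String {
        switch suit {
        case .hearts, .diamonds: return "red"
        case .clubs, .spades:    return "black"
        }
    }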

As an interesting trick, keyword arguments to functions are supported, but you have to opt in. That's probably a good combination. Keyword arguments are quite useful in cases where you have a lot of parameters, and sometimes this legitimately happens. However, it's unfortunate if you afflict all functions with keyword arguments, because the keyword arguments become part of the API. By making them opt-in, Swift leaves the feature available for the functions that can use it.
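
For instance, a function opts in by declaring external names for its parameters:

    // External names ("width", "height") become part of the API;
    // the internal names stay short.
    func resize(width w: Int, height h: Int) {
        print("resizing to \(w) x \(h)")
    }
    resize(width: 640, height: 480)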

Including both structs and classes looks initially redundant, but it's quite helpful to have a value type that encompasses multiple other values. As an example, the boxed Integer type in Java would be much better as a struct than as a class.
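
The difference shows up on assignment:

    // Value semantics: assigning a struct makes an independent copy.
    struct BoxedInt { var value: Int }

    var a = BoxedInt(value: 1)
    var b = a
    b.value = 99
    // a.value is still 1; had BoxedInt been a class, both names would
    // now refer to the same object and see 99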

Extensions look valuable for programming in the large. They let you make an existing class fit into a new framework, and they let you add convenience methods to an existing class. Scala uses its implicit conversions for extensions, but direct support for extensions also makes a lot of sense.
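
For example, retrofitting a convenience method onto a standard type:

    extension String {
        // A method the original String designers never had to anticipate.
        func indented(by spaces: Int) -> String {
            return String(repeating: " ", count: spaces) + self
        }
    }
    let line = "hello".indented(by: 4)   // "    hello"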

The way option chaining works is a nice improvement on Objective-C. In Objective-C, any access to nil returns nil. In practice, programmers are likely better off getting an error when they access nil, as a form of design by contract: when something goes wrong, you want the program to stop at that point, not some indefinite time later. Still, sometimes you want nil propagation, and when you do, Swift lets you just put a "?" after the access.
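
A small illustration:

    struct Address { var street: String }
    struct Customer { var name: String; var address: Address? }

    let customer: Customer? = Customer(name: "Ada", address: nil)

    // Opting in to nil propagation at exactly this spot:
    let street = customer?.address?.street   // nil, with no trap
    // Without the "?", a nil here would stop the program immediately:
    let name = customer!.name                // fine; customer is non-nil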

Weak references are helpful for any language with automatic memory management, but they look especially helpful in a language with reference-counting memory management. I don't follow why there are also "unowned" references, except that the designers didn't want your code to get polluted with ! dereferences. Even so, I would think this is a case of do or do not. If you are worried about ! pollution, which is a legitimate concern, then simply don't require the !.
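
The textbook use of a weak reference is breaking a cycle that reference counting cannot collect:

    class Person {
        var dog: Dog?
    }
    class Dog {
        weak var owner: Person?   // weak: becomes nil automatically when
                                  // the Person is deallocated, so the
                                  // cycle never keeps either object alive
    }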

As an aside, this is the reason I am not sure pervasive null is as bad as often claimed. In practical code, there are a lot of cases where a value is optional in general but, in a specific context, is known to be present. In such a case, you are just going to dereference it, and possibly suffer a null-pointer error if you were wrong. As such, programmers are guided into a style where they just insert dereferences until the compiler shuts up, which makes the code noisy without increasing practical reliability.

The dubious

Swift looks very practical and workable, but there are some issues I think could have been done better.

Single inheritance seems like a step backward. The linearization style of multiple inheritance has proven helpful in practice, and it eliminates the need for a separate "interface" or "protocol" feature. Perhaps the designers feel like C++'s multiple inheritance went badly, and are avoiding multiple inheritance like the plague? I used to think that way, but it's been multiple decades since C++'s core design. There are better designs for multiple inheritance nowadays.

Swift doesn't appear to include persistent data structures. This is the one feature I miss the most when I don't get to program in Scala, and I don't know why it isn't catching on more widely in other languages. Developers can add their own collection types, but since the new types aren't standard, you end up having to convert to standard types whenever you call into another library.

The automatic immutability of collections assigned to constants looks driven by the lack of persistent collections. It's better to support both features independently: let variables be either mutable or not, and let collections be mutable or not. All four combinations are very useful.

Deinitialization, also known as finalization, looks like a throwback to me. In a system with automatic memory management, you don't want to know precisely when your memory is going to get freed. As such, you can't count on deinitializers running soon enough to be useful. Thus, you always need a fallback plan of deallocating things manually. Once you deallocate manually, though, deinitializers become just a debugging technique. It's better to debug leaks using a tool than with a language feature.

In-out parameters seem like a step backwards. The trouble is that most functions use only in parameters, so when you see a function call, a programmer's default assumption is that the callee will not modify the argument. It can lead to bad surprises if the parameter gets modified at all. Out parameters are so unusual that it's better to be explicit about them, for example by taking a mutable collection as an argument.

Custom precedence (and associativity) is likely to go badly. We discussed this in detail, over the course of days, for X10, because X10 is a scientific language that really benefits from a rich set of operators. One problem with user-defined precedence is that it's hard to scope: you want to scope the operators themselves, not their implementations, because parsing happens before method lookup. It's also tough on programmers if they have to learn a new precedence table for every file of code they read. All in all, we concluded that Scala had a reasonable set of trade-offs here: have a built-in precedence table with a huge number of available operators, and make library designers simply choose from the existing operators.

I see no exceptions, which is likely to be a nuisance to programmers if they are truly missing. Sometimes you want to tear down a whole chunk of computation without exiting the whole program. In such cases, exceptions work very well. Maybe I just missed it.

Integer types are hard to get right, and I am not sure Swift has chosen a great approach. It's best to avoid unsigned types, and instead to have untyped operations that can apply to typed integers. It's also best to avoid having low-precision operations, even if you have low-precision storage. Given all of the above, you don't really need explicit conversions any more. Java's integer design is quite good, with the exception of the unnecessary char type that is not even good for holding characters. I suspect many people overlook this about Java, because it's a case where programmers are better off with a language with fewer features.

Saturday, January 18, 2014

Is Internet access a utility?

I forwarded a link about Network Neutrality to Google Plus, and it got a lot of comments about how Internet access should be treated like a utility. I think that's a reasonable perspective to start with. What we all want, I think, is for Internet access itself to be a baseline service, and for the Internet services on top of it to face fierce competition.

In addition to considering the commonalities between Internet access and utilities, we should also note the differences.

One difference is that utility treatment is meant for monopolies. You can only put one road in any physical location, and I will presume for the sake of argument that you don't want multiple power grids in the same locale. Internet access is not a monopoly, though! At least in Atlanta, we have cable, DSL, WiMax, and several cellular providers. We have more high-speed Internet providers than supermarket chains.

Another difference is that utilities lock down technology change to a snail's pace. With roads and power grids, the technology already provides near-maximum service for what is possible, so this doesn't matter. With telephony, progress has been locked down for decades, and I think we all lost out because of that; the telephone network could have been providing Skype-like services a long time ago, but as a utility they kept doing things the same way as always. Meanwhile, the Internet is changing rapidly. It would be really bad to stop progress on Internet access right now, the way we did with telephony several decades ago.

I believe a better model than utilities would be supermarkets. Like Internet providers, supermarkets carry a number of products that are mostly produced by some other company. I think it has gone well for everyone that supermarkets have tremendous freedom in their product selection, pricing, promotional activities, hours, floor layout, buggies, and checkout technology.

In contrast to what some commenters ask, I do not have any strong expectation about what Comcast will or won't try. I would, however, like them to be free to experiment. I've already switched away from Comcast and don't even use them right now. If Comcast is locked into their current behavior, then that does nothing for me, good or bad. If they can experiment, maybe they will come up with something better.

In principle, I know that smart people disagree on this, but I currently don't see anything fundamentally wrong with traffic shaping. If my neighbor is downloading erotica 24/7, then I think it is reasonable that Comcast give my Game of Thrones episode higher priority. The fact that Comcast has implemented this badly in the past is troubling, but that doesn't mean the next attempt won't work better. I'd like them to be free to try.