Wednesday, March 7, 2012

Posner on digital copyright

Richard Posner takes on digital copyright:
The importance of copyright, and hence the negative consequences of piracy for the creation of new works, are, however, often exaggerated. Most of the world’s great literature was written before the first copyright statute, the Statute of Anne, enacted in 1710. [...] Copyright law needs to be adapted to the online revolution in distribution.

Posner has a radical suggestion that I believe would work out just fine:
So, were Google permitted to provide complete online access to all the world’s books, in their entirety, the gain in access might more than offset the loss in authors’ royalties.

Posner justifies his claim by considering the increase in creativity and in creative works that would result.

I would further justify such a policy by considering what it is going to take to protect copyright in its current form. SOPA, PROTECT-IP, ACTA, and the DMCA are all based on controlling copies. I have little doubt that measures like them will succeed over time and grow stronger. The main way to fight them is more fundamental. Stop trying to prevent copies--which is impossible--and focus more on other revenue models. The models don't even have to be designed as a matter of public policy. Simply remove the props from under the old-fashioned models, and make room for entrepreneurs to search for new ones.

Wednesday, January 25, 2012

The good and bad of type checking, by example

I enjoyed watching Gilad Bracha present Dart to a group of Stanford professors and students. As one might expect, given Bracha's background, he spends considerable time on Dart's type system.

Several members of the audience seemed surprised to find a defender of Dart's approach to typing. They understand dynamically typed languages such as JavaScript, and they understand statically typed languages such as Java. However, they don't understand why someone would intentionally design a language where types, when present, might still fail to hold up at run time.

One blog post will never convince people one way or another on this question, but perhaps I can show the motivation and dispel some of the mystification around Dart's approach. Let me provide two examples where type checking would complain about a program. Here's the first example:

String fileName = new File("output.txt");

I find examples like this very compelling. The programmer has made an easy mistake. There's no question it is a mistake; this code will always fail when it is run. Furthermore, a compiler can easily detect the mistake simply by assigning a type to each variable and expression and seeing if they line up. Examples like this make type checking look really good.

On the other hand, consider this example:

void drawWidgets(List<Widget> widgets) { ... }
List<LabelWidget> labels = computeLabels();
drawWidgets(labels);

This program is probably fine, but a traditional type checker is forced to reject it. Even though LabelWidget is a subclass of Widget, a List<LabelWidget> is not a subtype of List<Widget>, so the function call in the last line will not type check. The problem is that the compiler has no way of knowing that drawWidgets only reads from its input list. If drawWidgets were to add some more widgets to the list, then there would be a type error.
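
To make the danger concrete, here is a minimal, self-contained Java sketch of the scenario the checker is guarding against. The Widget, LabelWidget, and IconWidget stubs and the raw-type cast are mine, added purely for illustration: if the call were allowed, drawWidgets could insert an IconWidget into the list of labels, and the failure would only surface later, at the read.

import java.util.ArrayList;
import java.util.List;

class Widget {}
class LabelWidget extends Widget {}
class IconWidget extends Widget {}

public class InvarianceDemo {
    // If a List<LabelWidget> were accepted here, this method could
    // legally insert any kind of Widget into it.
    static void drawWidgets(List<Widget> widgets) {
        widgets.add(new IconWidget());
    }

    public static void main(String[] args) {
        List<LabelWidget> labels = new ArrayList<LabelWidget>();
        // drawWidgets(labels);            // rejected by the type checker
        drawWidgets((List) labels);        // raw cast defeats the check (unchecked warning)
        LabelWidget first = labels.get(0); // throws ClassCastException at run time
        System.out.println(first);
    }
}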

There are multiple ways to address this problem. In Java, programmers are expected to rewrite the type signature of drawWidgets as follows:

void drawWidgets(List<? extends Widget> widgets) { ... }

In Scala, the answer would be to use an alternate List type that is covariant in its type parameter.
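
For concreteness, here is a self-contained Java sketch of what the wildcard fix buys (again, the Widget and LabelWidget stubs are mine): the call site now compiles, and in exchange the compiler forbids adding anything to the list inside drawWidgets, which is exactly the guarantee that makes accepting a List<LabelWidget> safe.

import java.util.ArrayList;
import java.util.List;

class Widget {}
class LabelWidget extends Widget {}

public class WildcardDemo {
    // Reading from a List<? extends Widget> is allowed, but adding to it
    // is not, which is what makes it safe to pass in a List<LabelWidget>.
    static void drawWidgets(List<? extends Widget> widgets) {
        for (Widget w : widgets) {
            System.out.println("drawing " + w);
        }
        // widgets.add(new Widget());  // rejected: the element type is unknown
    }

    public static void main(String[] args) {
        List<LabelWidget> labels = new ArrayList<LabelWidget>();
        labels.add(new LabelWidget());
        drawWidgets(labels);  // accepted by the type checker
    }
}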

Whatever the solution, it is clear that this second example has a much different impact on developer productivity than does the first one. First of all, in this second example, the compiler is probably wrong, and it is just emitting an error to be on the safe side. Second, the corrected version of the code is much harder to understand than the original; in addition to parametric types, it also uses a bounded existential type variable. Third, it raises the bar for who can use this programming language. People who could be perfectly productive in a simply typed language will have a terrible time with quantifier-happy generic Java code. For a host of reasons, I feel that on net the type checker makes things worse in this second example. The cases where it prevents a real error are outweighed by all the problems.

Dart's type system is unusual in that it is consistent with both examples. It rejects code like the first example, but is quiet for code like the second one.

Sunday, January 22, 2012

DNS takedowns alive and well

I wrote earlier that PROTECT-IP and SOPA are getting a disproportionate amount of attention. Specifically, I mused about this problem:
First, DNS takedowns are already happening under existing law. For example, the American FBI has been taking down DNS names for poker websites in advance of a trial. SOPA and PROTECT-IP merely extend the tendrils rather than starting something new.

Today I read news that indeed, the FBI has taken down the DNS name for Megaupload.com. I'm not sure the American public is in tune with precisely what its federal government is doing.

The news has other sad aspects beyond the use of DNS takedowns. A few of them leapt out at me:

  • There has not yet been a trial. If I asked most Americans how their legal system works, I expect one of the first things they would say is that, in America, people are innocent until proven guilty.
  • There are twenty years of jail time associated with the charges. Isn't that a little harsh for copyright violations? I think of jail as the way you penalize murderers, arsonists, and others who would be a threat to the public if left free. Intellectual property violations somehow seem not to make the cut.
  • It's an American law, but New Zealand police arrested some of the defendants.
  • The overall demeanor of the authorities comes off as rather thuggish. For example, they seized all manner of unrelated assets of the defendants, including their cars.
I am glad SOPA and PROTECT-IP went down. However, much of what protesters complained about is already happening.

Monday, January 2, 2012

DNS takedowns under fire in the U.S.

I get the impression that SOPA, the latest version of a U.S. bill to enable DNS takedowns of non-American web sites, is under a lot of pressure. A major blow to its support is that the major gaming console companies are backing out.

I am certainly heartened. However, the problem is still very real, for at least two reasons.

First, DNS takedowns are already happening under existing law. For example, the American FBI has been taking down DNS names for poker websites in advance of a trial. SOPA and PROTECT-IP merely extend the tendrils rather than starting something new.

Second, this bill won't be the last. So long as the Internet uses DNS, there is a vulnerability built right into the protocols. Secure DNS doesn't make it any better; on the contrary, it hands the keys to the DNS over to national governments.

The only long-term way to fix this problem is to adjust the protocols to avoid a single point of vulnerability. That requires a new way to name resources on the Internet.

Wednesday, December 28, 2011

All software has bugs

John Carmack has a great article up on his experience with bug-finder software such as Coverity and PC-Lint. One of his observations is this:
The first step is fully admitting that the code you write is riddled with errors. That is a bitter pill to swallow for a lot of people, but without it, most suggestions for change will be viewed with irritation or outright hostility. You have to want criticism of your code.
He feels that the party line for bug finders is true, that you may as well catch the easy bugs:
The value in catching even the small subset of errors that are tractable to static analysis every single time is huge.
I agree. One of the things that makes it easier to talk to more experienced software developers is that they take this view for granted. When I talk to newer developers, or to non-engineers, they seem to think that if we spend enough time on something we can remove all the bugs. That's not possible for any body of code larger than a few thousand lines. Removing bugs is more like purifying water: you can only manage the contaminants, not eliminate them entirely. Thus, software quality should be thought of from an effort/reward point of view.

I also have found the following to be true:

This seems to imply that if you have a large enough codebase, any class of error that is syntactically legal probably exists there.
An example I always come back to is the Olin Shivers double word finder. The double word finder scans a text file and detects occurrences of the same word repeated twice in a row, which is usually a grammatical mistake in English. I have started running it on any multi-page paper I write, and it almost always finds at least one such instance that is legitimately an error. If an error can be made, it will be made, so almost any automatic detector is going to find real errors.
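
The snippet below is not Shivers's program, just a minimal Java sketch of the idea, assuming the file to check is passed as a command-line argument; a real tool would also strip punctuation and report line numbers.

import java.nio.file.Files;
import java.nio.file.Paths;

public class DoubleWordFinder {
    public static void main(String[] args) throws Exception {
        // Read the whole file and split it into words on whitespace.
        String text = new String(Files.readAllBytes(Paths.get(args[0])));
        String[] words = text.split("\\s+");
        for (int i = 1; i < words.length; i++) {
            // Flag any word that repeats its predecessor, ignoring case.
            if (!words[i].isEmpty() && words[i].equalsIgnoreCase(words[i - 1])) {
                System.out.println("double word: " + words[i]);
            }
        }
    }
}
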
Another observation that jibes with my experience is:
NULL pointers are the biggest problem in C/C++, at least in our code.
I did a survey once of the forty most recently fixed bugs on the Squeak bug tracker, and I found that the largest single category was null dereferences. They were significantly more common than type errors, that is, bugs where a value of one type (e.g., a string) was used where another was intended (e.g., an open file).

I do part ways with Carmack on the relative value of bug finders:

Exhortations to write better code, plans for more code reviews, pair programming, and so on just don’t cut it, especially in an environment with dozens of programmers under a lot of time pressure.
If we were to candidly rank methodologies for improving quality, I'd put "write better code" above "use bug finders." In fact, I'd put it second, right after regression testing. I could be wrong, but my intuition is that there are a number of low-effort ways to improve software before it is submitted, and the benefits are often substantial. Things like "use a simpler algorithm" and "read your diff before committing" add just minutes to the time for each patch but often save over an hour of post-commit debugging from a repro case.

All in all, it's a great read on the value of bug-finding tools. Highly recommended if you care about high-quality software. HT John Regehr.

Saturday, December 17, 2011

Blizzard embraces pseudonyms

Blizzard Software's online service lets you use the same name in multiple games and on multiple servers within the same game. Historically, they required you to use a "real name" (in their case, a name on a credit card). This week they announced that they are deploying a new system without that requirement:
A BattleTag is a unified, player-chosen nickname that will identify you across all of Battle.net – in Blizzard Entertainment games, on our websites, and in our community forums. Similar to Real ID, BattleTags will give players on Battle.net a new way to find and chat with friends they've met in-game, form friendships, form groups, and stay connected across multiple Blizzard Entertainment games. BattleTags will also provide a new option for displaying public profiles.[...] You can use any name you wish, as long as it adheres to the BattleTag Naming Policy.
I am glad they have seen the light. There are all sorts of problems with giving away a real [sic] name within a game.

From a technical perspective, the tradeoffs they choose for the BattleTag names are interesting and strike me as solid:

If my BattleTag isn't unique, what makes me uniquely identifiable? How will I know I'm adding the right friend to my friends list? Each BattleTag is automatically assigned a 4-digit BattleTag code, which combines with your chosen name to create a unique identifier (e.g. AwesomeGnome#3592).
I'll go out on a limb and assume that the user interfaces built on this facility will indicate when you are talking to someone on your friends list. In that case, the system will be much like a pet names system, just with every name including a reasonable default nickname. When working within such UIs, it will achieve all three corners of Zooko's Triangle. When working outside them, the security aspect will be weaker, because attackers can create phony accounts with a victim's nickname but a different numeric code. That's probably not important in practice, so long as all major activities happen within a good UI such as one of Blizzard's own games.
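
As a sketch of how such an identifier might be modeled (the class and method names here are mine, not Blizzard's), the display name and the unique key are simply two views of the same record:

public class BattleTag {
    // The chosen nickname is not unique on its own; the nickname plus the
    // service-assigned 4-digit code is.
    private final String name;  // e.g. "AwesomeGnome"
    private final int code;     // e.g. 3592

    public BattleTag(String name, int code) {
        this.name = name;
        this.code = code;
    }

    // What other players see in chat and in the game world.
    public String displayName() {
        return name;
    }

    // What a friends list would store and look up: the unique identifier.
    public String uniqueId() {
        return String.format("%s#%04d", name, code);
    }
}

In this hypothetical model, a friends list would key on uniqueId() while chat shows displayName(), which is exactly the pet-names split described above.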

Regarding pseudonymity, I have to agree with the commenters on the above post. Why not do it this way to begin with and not bother with Real ID? They can still support real [sic] names for people who want them, simply by putting a star next to the names of people whose online handle matches their credit card. Going forward, now that they've done this right, why not simply scrap Real ID? It looks like high-level political face-saving. You have to read the announcement closely even to realize what they are talking about.

Thursday, December 1, 2011

Joshua Gans on ebook lending

Our approach to copyright is outdated now that we have a widespread Internet. What should we do? Joshua Gans proposes an approach based on lending and on tracking usage:
If lending is the appropriate mode for books, then how would the business of publishing look if it is built around lending rather than ownership? So here is my conjecture. All books are read on devices. Imagine that each device has built in a means of tracking what people read and how much. Imagine that it can also do this in a manner that respects privacy. Then the model I have in mind would allow publishers to receive money based on how much of a book people read and to price that at will.

I like the idea. One point of comparison is to the way radio works. In radio, the content is not DRMed, and you don't pay for each song you listen to. Instead, you subscribe in bulk to content and then flip around to whatever you feel like listening to. There are a variety of specific payment schemes on both sides of the arrangement. For the customer, I've encountered payment based on public taxes (Switzerland), by subscription (Sirius Radio), and by listening to ads (broadcast in the U.S.).

For the content producers, I am less clear about what contracts are out there. At least indirectly, however, they are paid more when more people are listening to them. I imagine that radio has the same sort of market research that television does, and that radio stations know how many people are listening to their station and at what times. They then, through mechanisms that are probably kludgy, buy more of the popular music and less of the unpopular music.
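
Gans does not spell out a formula, but translated back to ebooks, the simplest conceivable metering rule might look like the toy sketch below; the class, the function, and the numbers are mine, purely for illustration.

public class ReadingMeter {
    // Toy metering rule: the publisher picks a price for a complete read,
    // and the device reports what fraction of the book was actually read.
    static double payment(double fullReadPrice, double fractionRead) {
        double f = Math.max(0.0, Math.min(1.0, fractionRead));
        return fullReadPrice * f;
    }

    public static void main(String[] args) {
        // A reader who gets through 40% of a book priced at $10 per
        // complete read generates a $4.00 payment to the publisher.
        System.out.println(payment(10.0, 0.40));
    }
}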

It's a good idea, and I would be happy for it to catch on. Copies are trivial to make nowadays, so the only ways to control copying are rather draconian. Far better to put a good society first and then find business models that work with it.