Wednesday, December 28, 2011

All software has bugs

John Carmack has a great article up on his experience with bug-finder software such as Coverity and PC-Lint. One of his observations is this:
The first step is fully admitting that the code you write is riddled with errors. That is a bitter pill to swallow for a lot of people, but without it, most suggestions for change will be viewed with irritation or outright hostility. You have to want criticism of your code.
He feels that the party line for bug finders is true, that you may as well catch the easy bugs:
The value in catching even the small subset of errors that are tractable to static analysis every single time is huge.
I agree. One reason it is easier to talk to more experienced software developers is that they take this view for granted. When I talk to newer developers, or to non-engineers, they seem to think that if we spend enough time on something we can remove all the bugs. That is not possible for any body of code larger than a few thousand lines. Removing bugs is more like purifying water: you can only manage the contaminants, not eliminate them. Thus, software quality should be thought of from an effort/reward point of view.

I also have found the following to be true:

This seems to imply that if you have a large enough codebase, any class of error that is syntactically legal probably exists there.
An example I always come back to is the Olin Shivers double word finder. The double word finder scans a text file and detects occurrences of the same word appearing twice in a row, which is usually a mistake in English. I have started running it on any multi-page paper I write, and it almost always finds at least one instance that is legitimately an error. If an error can be made, it will be, so almost any automatic detector is going to find real errors. Another one that jibes with me is:
NULL pointers are the biggest problem in C/C++, at least in our code.
I once surveyed the forty most recently fixed bugs on the Squeak bug tracker, and I found that the largest single category was null dereferences. They were significantly more common than type errors, that is, bugs where a value of one type (e.g., a string) was used where another was intended (e.g., an open file).
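
To make the double word finder mentioned above concrete, here is a minimal sketch in Python. It is not Shivers's implementation, which I have not reproduced here; the function name `find_double_words` and the tokenizing rules are my own assumptions. It reports a word when it immediately follows itself, ignoring case, and treats punctuation as a break so that "that. That" is not flagged.

```python
import re

# Assumption: a "word" is a run of ASCII letters, optionally with one
# internal apostrophe (e.g. "don't"). Any other non-space character is
# matched one at a time and treated as a break between words.
TOKEN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\S")


def find_double_words(text):
    """Return a list of (line_number, word) for each immediately repeated word.

    The previous word is remembered across line breaks, so a word that ends
    one line and starts the next is also reported (on the second line).
    """
    hits = []
    prev = None
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN.finditer(line):
            token = match.group()
            if not token[0].isalpha():
                # Punctuation or a digit: forget the previous word.
                prev = None
                continue
            word = token.lower()
            if word == prev:
                hits.append((line_no, word))
            prev = word
    return hits
```

Calling `find_double_words(open(path).read())` on a draft gives a list of line numbers to check by hand. Some hits will be legitimate English ("had had"), which is the usual tradeoff with a detector this cheap.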

I do part ways with Carmack on the relative value of bug finders:

Exhortations to write better code, plans for more code reviews, pair programming, and so on just don’t cut it, especially in an environment with dozens of programmers under a lot of time pressure.
If we were to candidly rank methodologies for improving quality, I'd put "write better code" above "use bug finders". In fact, I'd put it second, right after regression testing. I could be wrong, but my intuition is that there are a number of low-effort ways to improve software before it is submitted, and the benefits are often substantial. Things like "use a simpler algorithm" and "read your diff before committing" add just minutes to the time for each patch but often save over an hour of post-commit debugging from a repro case.

All in all it's a great read on the value of bug finding tools. Highly recommended if you care about high-quality software. HT John Regehr.

Saturday, December 17, 2011

Blizzard embraces pseudonyms

Blizzard's Battle.net lets you use the same name on multiple games and on multiple servers within the same game. Historically, they required you to use a "real name" (in their case, a name on a credit card). This week they announced that they are deploying a new system without that requirement:
A BattleTag is a unified, player-chosen nickname that will identify you across all of Battle.net – in Blizzard Entertainment games, on our websites, and in our community forums. Similar to Real ID, BattleTags will give players on Battle.net a new way to find and chat with friends they've met in-game, form friendships, form groups, and stay connected across multiple Blizzard Entertainment games. BattleTags will also provide a new option for displaying public profiles.[...] You can use any name you wish, as long as it adheres to the BattleTag Naming Policy.
I am glad they have seen the light. There are all sorts of problems with giving away a real [sic] name within a game.

From a technical perspective, the tradeoffs they chose for the BattleTag names are interesting and strike me as solid:

If my BattleTag isn't unique, what makes me uniquely identifiable? How will I know I'm adding the right friend to my friends list? Each BattleTag is automatically assigned a 4-digit BattleTag code, which combines with your chosen name to create a unique identifier (e.g. AwesomeGnome#3592).
I'll go out on a limb and assume that the user interfaces that use this facility will indicate when you are talking to someone on your friends list. In that case, the system will be much like a pet names system, just with every name including a reasonable default nickname. When working within such UIs, they will achieve all of Zooko's Triangle. When working outside it, the security aspect will be weaker, because attackers can make phony accounts with a victim's nickname but a different numeric code. That's probably not important in practice, so long as all major activities happen within a good UI such as one within one of Blizzard's video games.
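
Here is a rough sketch of the kind of UI check I have in mind, in Python. Everything in it is hypothetical: the `BattleTag` and `FriendsList` names are mine, and the tag format (a name, `#`, then exactly four digits, compared case-sensitively) is inferred only from the `AwesomeGnome#3592` example in the announcement, not from the actual naming policy. The point is that the friends list stores the full identifier, so an impostor with the same nickname but a different code is not labeled as a friend.

```python
import re
from typing import NamedTuple


class BattleTag(NamedTuple):
    name: str
    code: str  # kept as a string so leading zeros survive


# Assumed format: a name with no '#' or whitespace, '#', four digits.
_TAG = re.compile(r"(?P<name>[^#\s]+)#(?P<code>\d{4})")


def parse_battletag(text: str) -> BattleTag:
    """Split 'Name#1234' into its parts, or raise ValueError."""
    m = _TAG.fullmatch(text)
    if m is None:
        raise ValueError(f"not a BattleTag: {text!r}")
    return BattleTag(m["name"], m["code"])


class FriendsList:
    """Hypothetical friends list keyed on the full name-plus-code identifier."""

    def __init__(self) -> None:
        self._friends = set()

    def add(self, tag: str) -> None:
        self._friends.add(parse_battletag(tag))

    def label(self, tag: str) -> str:
        """Return the text a UI might show next to a chat message."""
        parsed = parse_battletag(tag)
        if parsed in self._friends:
            # Known friend: the short nickname is safe to show on its own.
            return f"{parsed.name} (friend)"
        # Stranger: show the full identifier so look-alikes stand out.
        return f"{parsed.name}#{parsed.code} (not on your friends list)"
```

With `AwesomeGnome#3592` on the list, a message from `AwesomeGnome#0007` is shown with its full code and marked as not a friend. Outside such a UI, nothing stops a reader from looking only at the nickname, which is the weakness described above.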

Regarding pseudonymity, I have to agree with the commenters on the above post. Why not do it this way to begin with and not bother with RealID? They can still support real [sic] names for people who want them, simply by putting a star next to the names of people whose online handle matches their credit card. Going forward, now that they've done this right, why not simply scrap RealID? It looks like high-level political face cover. You have to read closely in the announcement even to realize what they are talking about.

Thursday, December 1, 2011

Joshua Gans on ebook lending

Our approach to copyright is outdated now that we have a widespread Internet. What should we do? Joshua Gans proposes an approach based on lending and on tracking usage:
If lending is the appropriate mode for books, then how would the business of publishing look if it is built around lending rather than ownership? So here is my conjecture. All books are read on devices. Imagine that each device has built in a means of tracking what people read and how much. Imagine that it can also do this in a manner that respects privacy. Then the model I have in mind would allow publishers to receive money based on how much of a book people read and to price that at will.

I like the idea. One point of comparison is to the way radio works. In radio, the content is not DRMed, and you don't pay for each song you listen to. Instead, you subscribe in bulk to content and then flip around to whatever you feel like listening to. There are a variety of specific payment schemes on both sides of the arrangement. For the customer, I've encountered payment based on public taxes (Switzerland), by subscription (Sirius Radio), and by listening to ads (broadcast in the U.S.).

For the content producers, I am less clear about what contracts are out there. At least indirectly, however, they are paid more when there are more users listening to them. I imagine that radio has the same sort of marketing research that television does, and that radio stations know how many people are listening to their station and at what times. They then, through mechanisms that are probably kludgy, buy more of the popular music and less of the unpopular music.

It's a good idea, and I would be happy for it to catch on. Copies are trivial to make, nowadays, so the only ways to control copies are rather draconian. Far better to put a good society first and then find business models that work with it.