Monday, January 31, 2011

That scary Internet

National governments are coming to fear the Internet as a potentially disruptive mechanism for their publics. China and Australia have installed national firewalls to attempt to filter information crossing their borders to and from the greater Internet. Most recently, Egypt has shut down portions of its Internet infrastructure.

Many reports speak of the Egyptian shutdown as a done deal. However, this is a misleading viewpoint. In point of fact, many Egyptians are still connected to the Internet through various means. The Internet is architected so that packets can take any route available from their source IP to their destination IP. As the old saying goes, "The Net interprets censorship as damage and routes around it". As with so many other things, an official shutdown just shuts down official business. Criminals don't care, nor do most of the general public.
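To make the "routes around it" idea concrete, here is a toy sketch. It is not how real Internet routing works (that is done by protocols such as BGP, not by a breadth-first search), and the network, the node names, and the find_route helper are all made up for illustration. It only shows the basic property: when one link is blocked, a path still exists as long as any alternative remains.

```python
from collections import deque

def find_route(links, source, destination, blocked=frozenset()):
    """Breadth-first search for a path that avoids blocked nodes.

    Hypothetical helper for illustration; returns a list of node
    names, or None when no route exists.
    """
    if source in blocked or destination in blocked:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbor in links.get(node, []):
            if neighbor not in visited and neighbor not in blocked:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None

# An invented network: two providers, one of which has a satellite uplink.
links = {
    "home": ["isp_a", "isp_b"],
    "isp_a": ["home", "gateway"],
    "isp_b": ["home", "satellite"],
    "satellite": ["isp_b", "gateway"],
    "gateway": ["isp_a", "satellite", "world"],
    "world": ["gateway"],
}

print(find_route(links, "home", "world"))
print(find_route(links, "home", "world", blocked={"isp_a"}))
print(find_route(links, "home", "world", blocked={"gateway"}))
```

In this toy network, blocking isp_a only lengthens the path, while blocking the single gateway cuts the route entirely. A shutdown is only total if every such alternative is closed.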

Regarding the American kill switch, I must wonder how the discussion has gotten as far as it did given American politics. Aside from the idea being technically hopeless and making times of peace more dangerous, it just doesn't seem American to let the president shut down a major category of speech. Has there ever been a U.S. president who tried to get a media kill switch, i.e. the ability to shut down every newspaper, pamphlet, printer, and copying machine at the press of a button?

Overall, I expect this gradual creeping oversight to know no bounds. The U.S. government is ham-handed, its members would universally prefer not to be discussed, and units such as the FCC are seeking a new reason to exist. Instead of gradually fighting each individual effort as they attempt to chip away at the open Internet, I would prefer a categorical principle that the U.S. government just does not have authority over the Internet. There's no reason they should, and they're not even competent.

Thursday, January 6, 2011

Software patents help what, again?

Via James Robertson, I read that Interval is suing about a dozen major software companies over patent infringement. I am having trouble finding an original link to the case information, but here's a link to one copy of Interval's opening volley.

Here's the IP Interval is suing over:
The ’507 patent describes an invention that enables a user to efficiently review a large body of information by categorizing and correlating segments of information within the body of information and generating displays of segments that are related to the primary information being viewed by the user.
From this alone, you might think they have some advanced technique for categorizing and showing related information. No, they really are claiming that the whole idea of showing users a list of items related to the one they are looking at is an Interval invention. For example, here is their complaint about eBay:
Defendant eBay has infringed and continues to infringe one or more claims of the ’507 patent under 35 U.S.C. § 271. eBay operates the eBay.com and Half.com websites, which provide content such as product listings and advertisements to users. In order to help users find additional content that may be of interest, the software and hardware that operate these websites compare the available content items to determine whether they are related. When a user views a particular content item, the eBay.com and Half.com websites generate displays of related content items so as to inform the user that the related items may be of interest. For example, as demonstrated by Exhibit 8, when a user views a particular product listing on eBay.com, the eBay.com website displays both the selected product information (identified by the orange box) and links to other related products (identified by the green boxes). The hardware and software associated with the eBay websites identified above and any other eBay websites that perform this function infringe at least claims 20, 21, 22, 23, 24, 27, 28, 31, 34, 37, 63, 64, 65, 66, 67, 70, 71, 74, 77, and 80 of the ’507 patent under 35 U.S.C. § 271.


The theory behind patents is that, without patent protection, nobody would have invented the idea in question. By offering patent protection, companies will devote resources to research that they otherwise would not have. Can anyone seriously believe, however, that we would have more innovation if all of AOL, Apple, eBay, Facebook, Google, Netflix, Office Depot, OfficeMax, Staples, Yahoo, and YouTube had honored this patent and not shown similar items on their web sites? Does anyone believe that if Interval hadn't "invented" this idea, nobody else would have?

An additional part of the rationale for patents is that the ideas are difficult to develop, that they would only emerge if significant private resources were dedicated to their research. That, too, is hard to believe for this idea. How long did it take the guys at Interval to come up with this idea? Five minutes, maybe?

I have an idea how to stimulate the software industry. Stop issuing software patents.

Monday, January 3, 2011

References without page numbers

James Robertson asks how we can reference a part of a book, if we read the book on a Kindle or other electronic medium:
...what does a page number even mean? It should be simple to graft the physical form page number into the metadata, but as we go forward, there may well be books for which no physical form exists. What then?
This isn't a new problem, but it's exacerbated by current norms of book publishing. Printed books often don't number their sub-entities at a finer grained level than chapters, so if you don't have the physical version in front of you, all you can cite is the chapter. Worse, if someone else has a physical version, and you're reading the electronic version, it's problematic if they give you a cite for a page number.

It's an old problem, though, and it has a lot of old solutions. It comes up any time the same text is printed multiple times with different page numbers. Two examples would be codes of law and the Christian Bible. If you want to cite a part of one of these, it's poor form to use a page number, because that page number is only valid for a specific printing. You instead make reference to the detailed numbers that have been applied to the sub-entities of the text.

Going forward, it would help if books started containing more fine-grained numberings as a matter of course. In theory we could instead use character counts or word counts, but that has two problems. It is prone to differences in convention, e.g. how many characters is a paragraph indent, and how many words are in "counter-revolutionary"? Worse, it doesn't work well for people using the print version, who would need a specially printed version with the position counts on the bottom of each page or in the margins.
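The convention problem is easy to demonstrate. Here is a minimal sketch, assuming two plausible but invented counting rules (the function names and the sample sentence are mine, not any e-reader's actual scheme). The same sentence gets different word counts depending on how hyphens are treated, and the same word lands at a different character offset depending on whether the paragraph indent is a tab or four spaces.

```python
import re

def count_words_whitespace(text):
    # Convention A: a word is any run of non-whitespace characters,
    # so "counter-revolutionary" counts as one word.
    return len(text.split())

def count_words_alpha(text):
    # Convention B: a word is any run of ASCII letters,
    # so the hyphen splits "counter-revolutionary" into two words.
    return len(re.findall(r"[A-Za-z]+", text))

sample = "The counter-revolutionary pamphlet was reprinted."

words_a = count_words_whitespace(sample)
words_b = count_words_alpha(sample)
print(words_a, words_b)

# Character offsets drift too: the same word sits at a different
# position depending on how the paragraph indent is encoded.
offset_with_tab = ("\t" + sample).index("pamphlet")
offset_with_spaces = ("    " + sample).index("pamphlet")
print(offset_with_tab, offset_with_spaces)
```

Two readers citing "word 3" or "character 27" would be pointing at different places unless they had agreed on every such detail in advance, which is exactly what numbered sections avoid.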

Bill Venners foresaw this problem for Programming in Scala, and he was careful to publish the ebook version such that it has the exact same page numbers as the printed book. This is possible because the ebook is a PDF file, and PDF files have the same pagination on every device. In addition to the consistent page numbering, the book includes fine-grained numbering of all the sections, figures, tables, and larger programming listings, so you can also cite things that way. In short, feel free to copiously cite parts of Programming in Scala. Don't worry about the ebook readers--they'll be able to look up your references just fine.