Review: Liars and Outliers by Bruce Schneier

tl;dr An enormously important book about understanding and optimizing security in the 21st century.

On the Internet, nobody knows you’re a dog. I don’t know Bruce Schneier, and he certainly doesn’t know me. Even so, when he announced a heavily discounted signed edition of Liars and Outliers, he was effectively testing the main hypothesis of the book: That in any society it is reasonable to uphold a non-zero level of trust, even in complete strangers:

  • Schneier trusted 100 random strangers (or at least enough of them to make a net gain) to reciprocate the offer by writing and publishing a review of the book.
  • 100 random people trusted him to sign copies of the book and send them to the correct addresses upon receipt of the money.
  • All 101 of us trusted essentially the rest of the human race not to interfere in the transaction, even when interference could mean easy money with virtually no chance of retribution.

Schneier goes on to explain, with his famous lucidity and references to much contemporary research, why this trust is essential to all human interchange, how trustworthiness depends heavily on the situation and not just the person, how a society with 100% conformity is not just a terrible goal but literally impossible, which human and artificial pressures push us to cooperate or defect, how more severe punishments are often ineffective or even counterproductive, and how social and technological evolution is too fast for democracy to stabilize the overall level of trust.

[At this point I wanted to double-check the scribbled-down criticisms below, but the book is 3,000 km away with a nephew. Please take the following with a grain of salt. And now that I’ve lowered your expectations, let’s continue!]

In a very few places I found the wording misleading. For example, the iTunes store doesn’t allow you to buy music, merely to license it for your personal use. As far as I understand from the very little I’ve read about this, if the iTunes store ever shuts down, in many jurisdictions you would not be allowed to download replacement songs that are audibly indistinguishable from the ones you had paid for.

The graphs are generally informative, but sometimes confusing. For example (pages 72-73):

  • Traits/Tendencies and natural defenses are both in the social pressures box, while the text says neither is a social pressure.
  • There’s an incentives line and a separate box.
  • Why are some of the lines double? If they’re strong, a thick line would be clearer.

One note is terrifying: On average, 7% of terrorists’ policy objectives are achieved? What method could conceivably be more effective than that for a (usually) tiny group, often of foreigners? Compare it to normal bureaucratic channels, where usually only billionaire citizens or corporations have the slightest chance of changing policy within a reasonable time.

Conclusion: I wish this had been compulsory reading at high school. With entertaining anecdotes, scary implications about human nature, and scientifically grounded careful optimism, it’s the most dangerous book everyone should read.

Social contract – Fulfilled!

EIF replies

In response to Glyn Moody’s Open Source and Open Standards under Threat in Europe, here are my open replies to the key people (I’ll post them as they are sent).

Joaquín Almunia:

Dear sir,

I have just read some disconcerting news and opinions regarding the EIF process (“Open Source and Open Standards under Threat in Europe” by Glyn Moody), and I hope you have the time to include the opinions of a software developer in your deliberations.

I have been working in private companies and at the European Organization for Nuclear Research (CERN) since my graduation in 2004. I am also an active web user and contributor. This activity has taught me a few important business lessons:
1. Open source software and data based on open standards* are much more robust in the face of change than the alternative. Software evolves fast, and if the proprietary software provider is unwilling or unable to make new software work with old data, the only options left are a costly and difficult re-implementation, a costly and difficult (often impossible because of data complexity) migration to other software, or outright abandonment.
2. Closed source means re-inventing the wheel over and over. The software business should be about creating additional value on top of what already exists, not about costly reiterations of what already exists.
3. With the availability of cheap Internet connectivity, storage and computing power comes the opportunity for individuals and communities to make millions of incremental improvements to software every day. These updates are available to anyone else, making for an enormous amount of work provided for free for anyone to build upon or profit from.

* I.e., software / standards which are available for free for anyone to view, modify and re-publish, optionally with additional restrictions or permissions, such as whether permissions may be changed on derivative works or whether source attribution is required.

This text, and other appeals, will be available shortly at l0b0.wordpress.com/2010/03/29/eif-replies/

Just received a reply. The gist:

Recently, “draft versions” of the revised EIF have apparently been published on the Internet and we understand that you refer to these draft versions. You should note that the Commission cannot comment on such draft versions as they do not reflect a formal Commission position. But let me assure you that the guiding principles for the revision of the EIF include technological neutrality and adaptability, openness and reusability, as specified in the legal base of the Programme “Interoperability Solutions for European Public Administrations” (ISA)2, in the context of which the revision is being carried out.

Throw the fucking switch, Igor!

What would you like to have said if you were the one behind the “big green button” when the LHC starts up in 2008? It’s the world’s most powerful particle accelerator, said to be the most complex machine ever built, and it will most likely set the stage for the next level of theoretical physics, so it had better be something in the “One small step” category.

Job trends in web development

The job search service Indeed has an interesting “trends” search engine: It visualizes the number of job postings matching your keywords over the last year. Let’s see if there is some interesting information about modern web technologies there…
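For the curious, here’s a minimal Python sketch of the kind of trend chart such a tool produces. The monthly counts are entirely made up for illustration; Indeed’s real data is not used here.

    # Minimal sketch of a keyword-trend chart, plotted from invented
    # monthly counts (not real Indeed data).
    import matplotlib.pyplot as plt

    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
    html_postings = [900, 910, 930, 950, 1000, 1100, 1150, 1120, 1000, 970, 960, 955]
    xhtml_postings = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

    plt.plot(months, html_postings, label="HTML")
    plt.plot(months, xhtml_postings, label="XHTML")
    plt.ylabel("Job postings matching keyword")
    plt.legend()
    plt.show()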

XHTML vs. HTML

The relation between XHTML and HTML [graph: relative popularity of XHTML and HTML in job offers] could be attributed to a number of factors:

  • XHTML is just not popular yet (1 Google result for every 19 on HTML).
  • The transition from HTML to XHTML is so simple as to be ignored.
  • The terms are confused, and HTML is the most familiar one.
  • XHTML is thought to be the same as HTML, or a subset of it.

The XHTML graph alone [popularity of XHTML in job offers] could give us a hint as to where we stand: At about 1/100 of the “popularity” of HTML, it’s increasing linearly. At the same time, HTML has had an insignificant increase, with a spike in the summer months (it is interesting to note that this spike did not occur for XHTML). XHTML could be poised for exponential growth, taking over from HTML, but only time will tell.

AJAX

This is an interesting graph [popularity of AJAX in job offers]: It grows exponentially, which is likely a result of all the buzz created by Google getting on the Web 2.0 bandwagon. Curiously, the growth rate doesn’t match that of the term “Web 2.0” [graph: relative popularity of AJAX and “Web 2.0” in job offers]. Attempting to match it with other Web 2.0 terms such as “RSS” [graph: relative popularity of AJAX and RSS in job offers], “JavaScript” [graph: relative popularity of AJAX and JavaScript in job offers], and “DOM” [graph: relative popularity of AJAX and DOM in job offers] also failed. The fact that the popularity of AJAX seems unrelated to that of Web 2.0, and even of JavaScript, is interesting, but I’ll leave drawing predictions from this as an exercise for the readers. :)

CSS

While insignificant when compared to HTML [graph: relative popularity of HTML and CSS in job offers], the popularity of CSS closely follows that of XHTML [graph: relative popularity of XHTML and CSS in job offers]. Based on that and the oodles of best practices out there cheering CSS and XHTML on, I predict the following: When CSS is recognized for its power to reduce bandwidth use and web design costs, it’ll drag XHTML up with it as a means to create semantic markup which can be used with other XML technologies, such as XSLT and RSS / Atom.

Discussion of conclusions

The job search seems to be only in the U.S., so the international numbers may be very different. I doubt that, however, based on how irrelevant borders are on the Web.

The occurrence of these terms will be slowed by factors such as how long it takes for the people in charge to notice them, understand their value / potential, and finally find areas of the business which need those skills.

Naturally, results will be skewed by buzz, large-scale market swings, implicit knowledge (if you know XHTML, you also know HTML), and probably another 101 factors I haven’t thought of. So please take the conclusions with a grain of salt.

My conclusions are often based on a bell-shaped curve of lifetime popularity, according to an article / book I read years ago. I can’t find the source, but it goes something like this (a rough sketch of such a curve follows the list):

  1. Approximately linear growth as early adopters are checking it out.
  2. Exponential growth as less tech savvy people catch on; buzz from tech news sources.
  3. Stabilization because of market saturation and / or buzz wearing off.
  4. Exponential decline when made obsolete by other technology.
  5. Approximately linear decline as the technology falls into obscurity.
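As a purely illustrative aside, here’s a minimal Python sketch of such a bell-shaped lifetime curve, using a Gaussian as a stand-in; the peak year and spread are invented numbers, not fitted to any real technology.

    # Purely illustrative bell-shaped "lifetime popularity" curve,
    # using a Gaussian as a stand-in; peak year and spread are invented.
    import numpy as np
    import matplotlib.pyplot as plt

    years = np.linspace(0, 20, 200)   # years since the technology appeared
    peak, spread = 10, 3              # hypothetical peak year and spread
    popularity = np.exp(-((years - peak) ** 2) / (2 * spread ** 2))

    plt.plot(years, popularity)
    plt.xlabel("Years since introduction")
    plt.ylabel("Relative popularity")
    plt.show()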

PS: For some proof that any web service such as Indeed should be taken with a grain of salt, try checking out the result for George Carlin’s seven dirty words [graph: relative popularity of George Carlin’s seven dirty words in job offers] ;)

Re: The Future of Tagging

Vik Singh has an interesting blog post on the future of tagging. IMO not so much because of the idea itself, since it looks quite similar to the directory structure we’re all familiar with, with the added possibility for objects to be inside several directories at the same time. But it got me thinking about what tagging lacks: an easy-to-use relation to more structured data about what the tag means.

Anyone who’s used the del.icio.us tagging interface provided by their bookmarklets or Firefox extension knows how easy it is to use. Just click any tag, and it’s added or removed, depending on whether it’s already applied. Click the “save” button when done. Dead easy.

Creating RDF, when compared with del.icio.us, is quantum theory. But it’s already being used, and will probably be one of the biggest players in the semantic web, where things have meanings which can be interpreted by computers. Using RDF, you can distinguish between the tag “read” as an imperative (read it!) and as an assertion (has been read). You can also make the computer understand that “examples” is the plural of “example”, and that curling is a sport (though some may disagree :)).
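To make that concrete, here’s a minimal sketch using Python’s rdflib of how the two senses of the tag “read” could be told apart. Every namespace, URI, and property name in it is made up for illustration; this is just one way the idea could be expressed.

    # Sketch: disambiguating two senses of the tag "read" with RDF (rdflib).
    # All namespaces and URIs are made up for illustration.
    from rdflib import Graph, Literal, Namespace, URIRef

    EX = Namespace("http://example.org/tags/")
    g = Graph()

    bookmark = URIRef("http://example.org/bookmarks/42")

    # "read" as an imperative: this bookmark is on the to-read list.
    g.add((bookmark, EX.taggedWith, EX.toRead))
    g.add((EX.toRead, EX.meaning, Literal("Item I intend to read")))

    # "read" as an assertion: this bookmark has already been read.
    g.add((bookmark, EX.taggedWith, EX.alreadyRead))
    g.add((EX.alreadyRead, EX.meaning, Literal("Item I have finished reading")))

    print(g.serialize(format="turtle"))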

How could we combine the two? Here’s an idea: When clicking any of the tags in the del.icio.us tagging interface, you’ll be asked what you mean, by getting the possibility to select any number of meanings from a list of one-sentence definitions. E.g., when selecting “work”, you could get these choices:

  • Item on my todo list
  • Something you’ve worked on
  • Something somebody else has worked on
  • A job/position
  • None of the above
  • Show all meanings
  • Define new meaning…

The list would normally only contain the most popular definitions, to harness the power of the “best” meanings, as defined by the number of users. The “Show all meanings” link could be used to show the whole range of meanings people have defined.
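Here’s a rough Python sketch of that idea, assuming each tag maps to its candidate meanings along with a count of how many users picked each one (all numbers invented):

    # Sketch of "most popular meanings first": each tag maps to candidate
    # meanings with a count of how many users picked them (invented numbers).
    from collections import Counter

    meanings_for_work = Counter({
        "Item on my todo list": 1520,
        "Something you've worked on": 870,
        "Something somebody else has worked on": 230,
        "A job/position": 3100,
    })

    def top_meanings(meanings, n=4):
        """Return the n most popular one-sentence meanings for a tag."""
        return [meaning for meaning, _count in meanings.most_common(n)]

    print(top_meanings(meanings_for_work))
    # "Show all meanings" would simply list meanings.keys() in full.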

“Define new meaning…” could give you a nice interface for defining the meaning of the word in the context of the link you’re tagging at the moment. This is where the designers really have to get their minds cooking to produce something usable by at least moderately computer-literate users.

The Internet generation

Sometime when I was younger, I heard that generations normally shift at 30-year intervals. So either I was just on the “old” side of the last shift, or the generations are changing faster now than my parents’ lore would have it.

Yes, I’m talking about the Internet generation. I’m lucky enough to be part of the Nintendo generation, but the Internet was outside my scope until high school. Even then, it seemed a strange and geeky place, and the idea of its widespread adoption by “mom’n’dad” didn’t even manifest itself. Now they are braving the fields of the unknown, if not with enthusiasm, then at least with a slight interest. Myself, I’m online as long as my home or work computer is not having a well-earned break.

But still, I’ll never be part of the Internet generation. It is composed of those who do not yet know how to spell, but who know where to click in order to play “snakes and ladders” with their friends online. They will be the first to grow up in a world where the Internet is ubiquitous.

How does this bode for the future of the Internet? Certainly, usability will be an issue when everyone is online. Government and private services will be expected to be available online, with security, accountability, speed, and reliability at higher levels than could ever be obtained by manual work. People will meet each other, exchange digital signatures as easily and naturally as business cards, and use them to enable secure message transfer free from spam and phishing attempts. Idle CPU cycles and free storage, which are already astronomical, will be put to use in distributed computing and storage systems, solving research problems and backing up your family photos. Passwords will be replaced by biometric or other “natural” methods of authentication. Users with little or no expertise in security, or even computers in general, will be able to set up totally secret conversations with others.
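As a rough illustration of the “digital signatures as business cards” idea, here’s a toy Python sketch using the cryptography package with Ed25519 keys. It shows only the sign / verify core, not a complete secure-messaging scheme.

    # Toy sketch of signing and verifying a message with a personal key pair
    # (Ed25519 via the "cryptography" package); not a full messaging scheme.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()   # the part you hand out like a business card

    message = b"Lunch on Tuesday?"
    signature = private_key.sign(message)

    try:
        public_key.verify(signature, message)   # raises if the message was altered
        print("Signature checks out")
    except InvalidSignature:
        print("Message was tampered with or signed by someone else")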

As always, there’s a flip side. Higher levels of security will mean that people put more trust into the systems and forget that ultimately there are people behind them, creating and maintaining them. If perfect security is assumed, the results can be disastrous when it is proven otherwise. Technology, like humans, does not perform perfectly. Also, secrecy is a useful tool for criminals. However, I believe this will lead to the use of low-tech methods for catching them, and to long-wanted, real privacy for ordinary citizens. With regard to biometric measurements, there have been concerns that criminals might cut off an organ or limb to get access to a system; however, this could be solved by extending the measurements to the whole body. There is also the issue of psychological damage from material on the Internet, which is discussed in another blog entry.

One thing is for sure: The Internet is here to stay, and it will influence the lives of our children, for better or worse.

Patent suicide

Yeah, I know. Everyone’s writing about it, so this isn’t going to be too original. But I’d like to write some of this down before the Black Monday of the patent hogs happens, and the system finally gets the review and revolution it needs.

According to The Register, patent 4,734,690, filed by Tektronix in 1987, covers the display in 2D of a 3D image. Which should cover just about … every FPS since Wolfenstein! McKool Smith, a US legal firm, is now suing numerous companies for violation of this patent. Some of the giants mentioned include Electronic Arts (Sim series), Activision (Doom series), Take Two (Grand Theft Auto series), Ubisoft (Myst, Rainbow Six series), Atari (Civilization series), Vivendi Universal (Half-Life series), Sega (Sonic the Hedgehog series), and LucasArts (Star Wars series).

The case mentioned is in no way unique. There are numerous examples of what seems to be a trend of making money not in the good old-fashioned way of actually producing something, but rather by suing others for making something similar to what you made, maybe decades before (all links go to news articles about lawsuits for patent infringement). Big companies are sued because the legal systems of many countries calculate fines according to the size of the company, and smaller companies are sued because many of them would rather settle than risk bankruptcy in case of a loss in court.

So what is the basis of the problem? Something which popped up while writing this was that each and every patent is like a law, the main difference being that fines are paid to the patent holder, not the state. I am not familiar with the legal texts of patents, but I would expect most of them to correspond to at least an A4 page of text. With the current number of patents, that amounts to millions of pages! How are companies supposed to keep up to date with that? The result is that most companies produce and innovate without checking for pre-existing patents on their products, and just hope that they are not interfering with existing patents. In other words, the current patent system is a time bomb in the face of any company.

So what can be done? Eliminating patents altogether would be extremely unfair towards the innovators, as good ideas would be copied as soon as they were made public. The mandate of patent offices could be extended to check project descriptions and the like for any possible infringements, but errors in this process could create legal chaos. Who would be to blame? Also, it would probably be too expensive to be done effectively by public offices. Stricter patent reviews could be used, but would probably take enormous amounts of time because of the complexity of the legal aspects. How about passing laws ensuring that some fixed part of the settlement / fine sums goes to the state? Sure, there would be less suing for patent infringement, but this could encourage companies to take patent infringement even more lightly.

There are two measures which, in combination, I believe could solve at least part of the problem. They are based on the assumption that there are two ways to know whether project X will infringe on a patent: Intricate knowledge of any patents in the same and bordering business areas, and actually seeing a product which is very similar to the expected result of the project. The first is problematic because of the enormous amount of work involved in getting legal approval before the project is finished: resources are always limited, and projects normally evolve away from their initial plan. The second I believe to be much easier in the general case: Take a look at finished products, and see if they already share key features with the expected result of the project. Based on this, I propose that either of the following must hold for company A to win a patent infringement case against company B:

  1. Company A must prove that, at the time company B’s product was in the stores, it was planning to produce, already producing, or in the process of selling a product utilizing the patent.
  2. It must be proven that company B somehow knew, or for some obvious reason should have known, about the patent.

In other words, unless company A was in the process of making a product based on the patent, it shouldn’t be able to stop other companies from creating such a product. This would make sure that companies cannot buy patents just to stifle the production of something revolutionary, thus holding back innovation. Also, if there is no reason to believe that company B knew of the patent, they shouldn’t be punished after the fact.

It should be noted that if company A discovers that company B is infringing on its patent while in the planning, production, or sale period, it should send a “cease and desist” letter to company B, stating the relevant patent number and some kind of indication that planning, production, or sale is in progress. If company B chooses to ignore this, it is clear that they have violated point 2 above, and so could be successfully sued.

As a nerd, I also have to suggest a technical solution: RDF, or Resource Description Framework. In short, it can be used to enable computer reasoning about complex, human-related themes. Basically, it defines relationships between generally atomic pieces of data, and also statements about those relationships themselves. The point is that this could possibly be used to formalize and query patents. You could specify key concepts about something you are planning to produce, and the system would (by means of logical inference) return any infringing patents, explaining which parts of the patents are relevant. This is much more than a plaintext search (à la Google) can achieve, because it doesn’t just work on the words themselves, but on their meaning.
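As a toy sketch of that idea, here’s how overlapping concepts between patents and a planned product could be queried with rdflib and SPARQL. The property names and the data are invented for illustration (apart from the patent number mentioned above); real patent formalization would of course be far more involved.

    # Toy sketch: describe patents and a planned product by the concepts they
    # use, then query for overlaps. Property names and data are invented.
    from rdflib import Graph, Namespace

    EX = Namespace("http://example.org/patents/")
    g = Graph()

    # Two patents and the concepts they (supposedly) cover.
    g.add((EX.patent4734690, EX.covers, EX.projectionOf3Dto2D))
    g.add((EX.patentB, EX.covers, EX.onlineShoppingCart))

    # Concepts our planned product relies on.
    g.add((EX.ourGameEngine, EX.uses, EX.projectionOf3Dto2D))

    results = g.query("""
        SELECT ?patent ?concept WHERE {
            ?patent  <http://example.org/patents/covers> ?concept .
            ?product <http://example.org/patents/uses>   ?concept .
        }
    """)
    for patent, concept in results:
        print(f"Possible overlap: {patent} via {concept}")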