Re: The Future of Tagging

Vik Singh has an interesting blog post on the future of tagging. Not so much because of the idea itself, which looks a lot like the directory structure we all know, with the added possibility for objects to be in several directories at the same time. But it got me thinking about what tagging lacks, which is an easy-to-use relation to more structured data about what the tag means.

Anyone who’s used the tagging interface provided by bookmarklets or the Firefox extension knows how easy it is to use. Just click any tag, and it’s added or removed, depending on whether it’s already applied. Click the “save” button when done. Dead easy.

Creating RDF, by comparison, is quantum theory. But it’s already in use, and will probably be one of the biggest players in the semantic web, where things have meanings which can be interpreted by computers. Using RDF, you can distinguish between the tag “read” as an imperative (read it!) and as an assertion (has been read). You can also make the computer understand that “examples” is the plural of “example”, and that curling is a sport (though some may disagree :)).
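To make the idea concrete, here is a minimal sketch of the RDF approach using plain subject–predicate–object triples in Python. The URIs and predicate names are invented for illustration, not a real vocabulary; actual RDF tooling (e.g. rdflib) would use proper namespaces.

```python
# RDF-style triples: two distinct resources can share the plain-text label
# "read" while meaning different things. All URIs here are hypothetical.
triples = [
    ("http://example.org/tag/read-todo", "label", "read"),
    ("http://example.org/tag/read-todo", "meaning", "imperative: read this!"),
    ("http://example.org/tag/read-done", "label", "read"),
    ("http://example.org/tag/read-done", "meaning", "assertion: has been read"),
    # Structured relations a computer can follow.
    ("http://example.org/tag/examples", "pluralOf", "http://example.org/tag/example"),
    ("http://example.org/tag/curling", "isA", "http://example.org/concept/sport"),
]

def meanings_of(label):
    """Return every defined meaning for a plain-text tag label."""
    subjects = {s for (s, p, o) in triples if p == "label" and o == label}
    return sorted(o for (s, p, o) in triples if s in subjects and p == "meaning")

print(meanings_of("read"))
# → ['assertion: has been read', 'imperative: read this!']
```

The point is that the ambiguous label is just one property of a resource, so several resources can carry it while staying distinguishable.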

How could we combine the two? Here’s an idea: when clicking any of the tags in the tagging interface, you’d be asked what you mean by it, by getting the possibility to select any number of entries from a list of one-sentence meanings. E.g., when selecting “work”, you could get these choices:

  • Item on my todo list
  • Something I’ve worked on
  • Something somebody else has worked on
  • A job/position
  • None of the above
  • Show all meanings
  • Define new meaning…

The list would normally only contain the most popular definitions, to harness the power of the “best” meanings, as defined by the number of users. The “Show all meanings” link could be used to show the whole range of meanings people have defined.
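The ranking itself is trivial; a sketch of “harnessing the power of the best meanings” could look like the following, where the stored user choices and the tag “work” are hypothetical example data.

```python
from collections import Counter

# Hypothetical store of the meanings users have picked for the tag "work".
user_choices = [
    "Item on my todo list", "Item on my todo list", "Item on my todo list",
    "A job/position", "A job/position",
    "Something somebody else has worked on",
]

def top_meanings(choices, n=3):
    """The n most popular meanings, as defined by the number of users."""
    return [meaning for meaning, _count in Counter(choices).most_common(n)]

# The short list shows only the most popular definitions;
# "Show all meanings" would simply drop the n cutoff.
print(top_meanings(user_choices, n=2))
# → ['Item on my todo list', 'A job/position']
```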

“Define new meaning…” could give you a nice interface to define the meaning of the word in the context of the link you’re tagging at the moment. This is where the designers really have to get their minds cooking to get something usable by at least mid-range computer literates.

More practical HTTP Accept headers

Isn’t it time for user agents to start reporting in a more fine-grained manner which standards they support? The HTTP Accept header doesn’t provide enough information to know whether a document will be understood at all, which can lead to quite a few hacks, especially on sites using cutting-edge technology such as SVG or AJAX.

As an example, compare the Firefox Accept header (text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, text/plain;q=0.8, image/png, */*;q=0.5) with the support levels reported at Web Devout’s Web browser standards support page or Wikipedia’s Comparison of web browsers.
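To see how little that header carries, here is a small sketch that parses it into (media type, quality) pairs; per HTTP, an entry without a q parameter defaults to q=1.0.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = part.strip().split(";")
        media_type, q = fields[0].strip(), 1.0  # q defaults to 1.0
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        entries.append((media_type, q))
    return sorted(entries, key=lambda e: -e[1])

firefox = ("text/xml, application/xml, application/xhtml+xml, "
           "text/html;q=0.9, text/plain;q=0.8, image/png, */*;q=0.5")
for media_type, q in parse_accept(firefox):
    print(media_type, q)
```

All the server learns is a ranked list of media types; nothing says how much of SVG, CSS, or the DOM the browser actually implements.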

Say I want to serve a page with some SVG, MathML, CSS2/3, and AJAX functionality. Each of these requires different hacks to ensure that non-compliant browsers don’t barf on the contents. For SVG and MathML, I can use CSS to put the advanced contents above the replacement images, or use e.g. SVG itself to provide replacement text. Both methods increase the amount of content sent to the user agent, and neither is really accessible: non-visual browsers get the same information twice.

For CSS, countless hacks have been devised to make sure sites display the same in different browsers. So the user agent always receives more information than it needs.

AJAX needs to check for JavaScript support, then XMLHttpRequest support, and then must use typeof to switch between JS methods. This can easily triple the length of a script.

What if browsers could negotiate support with the server using e.g. namespace URIs, where these would reference either a standard, a part of it, or some pre-defined support level? Poof: SVG 1.1 Tiny 95% supported, CSS 3 10% supported, DOM Level 2 80% supported, etc.

Obviously, the Accept header would be much longer, but the contents received could be reduced significantly. Also, I believe it would be easier for developers to rely on Accept header switching alone than to learn all the hacks necessary for modern web development.
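A server-side sketch of the proposal might look like this. The header syntax (namespace URI plus a level parameter) is entirely hypothetical, invented here to illustrate the negotiation; no such header exists.

```python
# Hypothetical extended Accept header: "uri;level=NN" entries, where NN is
# the percentage of the standard the user agent claims to support.

def parse_support(header):
    """Parse 'uri;level=NN' entries into a {uri: level} dict."""
    levels = {}
    for part in header.split(","):
        fields = part.strip().split(";")
        uri, level = fields[0].strip(), 0
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "level":
                level = int(value)
        levels[uri] = level
    return levels

def pick_variant(levels):
    """Serve SVG only when the agent claims decent support, else a PNG fallback."""
    if levels.get("http://www.w3.org/2000/svg", 0) >= 80:
        return "figure.svg"
    return "figure.png"

header = "http://www.w3.org/2000/svg;level=95, http://www.w3.org/1999/xhtml;level=100"
print(pick_variant(parse_support(header)))
# → figure.svg
```

The interesting property is that the fallback logic lives on the server once, instead of being re-implemented as client-side hacks on every page.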

I don’t really know if this is possible, but maybe this kind of Accept header could be separated into a special HTTP reply. This would contain the URIs of the potential contents, and the user agent would send a new HTTP GET request with the modified Accept header, reporting the support levels.

Note: This post is the same as sent in reply to an email by Allan Beaufour on the www-forms mailing list of W3C. The text has been slightly modified for legibility.

Follow-up: Added a bug report for Firefox and a short Wikipedia article.

Re: Bug #1 in Ubuntu

Ubuntu Linux has become very popular in the open source community in only two years, most recently confirmed by the 2005 Members Choice Awards. But being number one in a community of computer literates, nerds, and geeks is not enough to gain a significant market share, as shown by their bug #1, entitled “Microsoft has a majority market share”.

So here’s a couple of cents’ worth of musings on why open source (and especially Linux) isn’t much used, what can be done about it, and where it’s headed.

Ignorance, or being stuck with what you’ve got

AKA, “users don’t know that there are alternatives.” This is a popular explanation, but I believe it’s a bit off the mark: users don’t know that there exist alternatives which are both free and good. Most people believe there’s no such thing as a free lunch, and free (as in speech and beer) sounds dubious to people who have learned not to trust anything on the Internet.

The “family nerd”, the poor soul who has to fix everybody’s computer problems, has become the forerunner in this mission, thanks to the empirical evidence of the security, stability, and usability of some brilliant new software. Already, Firefox seems to be spreading like wildfire, helped by a host of extensions such as AdBlock and SessionSaver, the latter of which will be part of Firefox 2.0.

The trench wars over the legality of file sharing have also sparked a lot of open source development, such as RevConnect (a Direct Connect client) and Azureus (a BitTorrent client). These can easily replace the old routine of jumping through elaborate web page hoops to download free software and media files from pages you don’t know whether to trust. With P2P software, trust works like viral marketing: if something is popular (i.e., many people are sharing or downloading the same files), it’s probably good stuff.

That’s also part of the beauty of open source: since the program and source code are scrutinized by many, any malicious behavior by the developers is likely to be detected, discussed, and resolved, either by a fork of the code or by mass abandonment by the users. Closed vs. open source security is much too big a discussion to take on here; suffice it to say that both camps have released software with horrible bugs.

Laziness, or work space pragmatics

AKA, “users don’t want to learn something new.” Also not completely true: users are, for the most part, primarily interested in getting their work done. If Microsoft Office were as broken as Internet Explorer, users would be downloading alternatives in droves, but as it turns out, current (and older) word processing and email programs are more than adequate for the common user. Hell, I could probably use WordPerfect at work without anybody noticing.

When your primary goal is to get a bunch of bulleted lists in a presentable format in two hours, it is hardly pragmatic to start thinking about whether you’ll be able to view that same presentation in five years, or convert it into a PDF file in a flash. And from what I’ve heard and read, a lot of people seem to think that keeping to the same level as everybody else is the optimum. Hacking the system to make it conform to your work style will invariably break it sometimes, and that is likely to be more visible than the x% overall productivity gain.

Openness is not usually part of the consideration: users don’t want to “fiddle” with system settings, and developers don’t want to put in the extra effort to make their software cross-platform.

Stupidity, or not thinking (far) ahead

Except among nerds and geeks, the freedom of open source software seems to be ignored. It’s difficult to find good analogies, but that won’t stop me from trying: food. “Closed source food” would be sold without ingredient or nutrition descriptions (source code & bug databases), it would be physically addictive (vendor lock-in mechanisms), and you would not be allowed to share it, take it with you to a new kitchen, or eat it at work (license).

Other points, which are not inherent in closed source, but seem to be the norm in the industry: It would be sold in sealed boxes (retail distribution), nobody actually producing it would be available for help (customer support), for the most part you would have to use special cutlery (OS / hardware) for it, and every so often your recipes incorporating this food would go haywire (incompatible upgrades, bad standards support).

Open source food, on the other hand, would come with a recipe for how to make your own (source code), virtually no expiration date (license allows for porting), and often free help with your recipes (community support).

OpenDocument is a great example of the superiority of open source: if you uncompress an OpenDocument text (Writer) file, you’ll find the original images, application settings, contents, and text styles in separate files. If you want to change some detail in the text or an image, you can manipulate the file in a suitable editor, and then compress the files again to get a fully working document. Most of the files are XML, so you can also easily automate data collection or manipulation across several documents. No such luck with Microsoft Word: everything is saved in a proprietary, secret, binary format, which the team has spent years deciphering in order to be able to import MS Office documents.
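Since an OpenDocument file is just a ZIP archive of XML parts, plain scripting is enough to get at the contents. Here is a minimal sketch that builds a stand-in archive in memory and reads it back; a real .odt has more parts (styles.xml, a manifest, embedded images), and the XML below is a placeholder, not valid ODF.

```python
import io
import zipfile

# Build a minimal ODF-like archive in memory. The XML content is a
# placeholder standing in for the real OpenDocument markup.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as odt:
    odt.writestr("content.xml", "<office:document-content>Hello</office:document-content>")
    odt.writestr("meta.xml", "<office:document-meta/>")

# Reading it back is just as direct -- no reverse engineering required.
with zipfile.ZipFile(buf) as odt:
    names = odt.namelist()
    content = odt.read("content.xml").decode()

print(names)
print(content)
```

The same few lines, pointed at a real .odt file, are all it takes to extract or rewrite a document’s text; that transparency is exactly what the binary Word format denies you.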