Whither the Semantic Web

Posted on SemanticWeb Discussion group, June 17, 2004

I have been and continue to be a supporter of the Semantic Web. It is, however, an idea that I believe has drifted and with each passing day is becoming more irrelevant. I have enormous respect for the creativity and innovation of those working on it. But I wonder whether they haven't been led astray.

To me, the most successful aspect of the Semantic Web isn't the WWW (because nobody thinks of the WWW as part of the Semantic Web, and simply saying it won't make it so). The most successful part of the Semantic Web is RSS (Rich Site Summary, or Really Simple Syndication). That RSS is successful is beyond dispute; there are millions of sites using RSS worldwide and millions of people reading RSS. But what is ironic is that RSS developed outside the Semantic Web development infrastructure; if the Semantic Web is what the W3C is building (and only that), then RSS is not a part of the Semantic Web.

What's going wrong, it seems to me, is that the developers of the Semantic Web have lost any sense of the idea that these technologies must be used by millions of users, and that for this to happen, the technologies must be (a) capable of being understood by millions of people, and (b) easily implemented by these same people. I think that the Semantic Web is failing on both counts, not because the technologies and concepts are inherently incomprehensible or difficult, but because the descriptions and implementations are.

From where I sit, there are two ways of developing a new technology (a new specification, a new language, whatever):

  • develop a simple core that users can expand if they need to
  • develop a comprehensive system anticipating the way users would expand it

RSS developed the first way. The RSS language is very easy to learn - it has only a half dozen core elements, expressed in the simplest possible XML. An RSS file can be created by any person with a minimum of technical skill. Examples that they can use as templates abound. RSS has been extended in various ways, and various versions have emerged, and they all work. Just as with HTML, you don't have to be letter perfect to a detailed spec to make it work; you need only create valid XML.
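To make the "half dozen core elements" concrete, here is a minimal sketch in Python that writes a feed and reads it back using only the standard library. The site name, URLs and the build_feed helper are placeholders invented for illustration; the element names (channel, item, title, link, description) are the ordinary RSS 2.0 ones.

```python
import xml.etree.ElementTree as ET

def build_feed(title, link, description, items):
    """Build a minimal RSS 2.0 document and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # The channel describes the site itself.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    # Each item describes one post, reusing the same three element names.
    for item in items:
        node = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            ET.SubElement(node, tag).text = item[tag]
    return ET.tostring(rss, encoding="unicode")

# Placeholder content; nothing here comes from a real site.
feed_xml = build_feed(
    "Example Site",
    "http://example.org/",
    "A placeholder feed",
    [{"title": "First post", "link": "http://example.org/1", "description": "Hello"}],
)

# Any XML parser can read the result back.
parsed = ET.fromstring(feed_xml)
titles = [item.findtext("title") for item in parsed.iter("item")]
```

The same file could just as easily be typed by hand in a text editor, which is rather the point.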

But, for a contrast, look at any W3C page describing this or that specification. These pages are difficult to read, the specification is described in a dense, developer-oriented style, examples are hard to come by, and there is no room for error or modifications - it's the W3C's way or the highway (yes, this may not be technically true, but it's the impression a reader gets). The Semantic Web (in capitals) is something hard-edged Java gurus write, which can be used in academic or enterprise systems, but is hardly something you would put on your web page. And it is therefore (to the vast majority of people) useless.

To me, it seems that the W3C and the semantic web people should get back to their roots. It is still not too late to take a leadership position here, though it seems to me that it will require some hard rethinking. The future isn't enterprise systems, proprietary databases, web services, Java runtime engines, or standardized ontologies. That's not what the web was, that's not what RSS is, and that's not the future of online semantics.

Here is some gospel: if you can't do it simply, with a simple text-editor, a web server and a standard browser, it's broken.

The fact is, the vast majority of people out there do not have administrator level access to a web server and database implementation. This means that the complex Java infrastructures, XML databases, application servers, and the like, are out for them. They probably don't have privileges to compile, and they certainly cannot install Tomcat. One version of Java was too many for them; several different and incompatible versions were far too many.

More gospel: if you can't say what you want to say with it, it's broken.

Semantic web infrastructure is incredibly complex and expressive. But there's no simple way to describe yourself, except in plain text. The same community that developed RSS has had to develop FOAF, a hack. Why don't we have an RSP (Really Simple People) protocol that people can create in Notepad (or generate with a simple Perl script) and post on their website? It's like the W3C has created a way to Say Everything but hasn't given a person a way to say "I exist".

Why can't we describe events? The RSS community is searching for a nice, simple events format, RSE (Really Simple Events), again analogous to RSS, to say that 'This event, which is part of [ref] this event, starts on April 9 at the Grand Hall in Moncton, and ends an hour later.' It would take, what, an hour for one of the W3C gurus to specify and propose? Again, simple Perl that people could hack and modify would make this a powerful technology. But where is it? It appears as though the W3C wants to describe The Unfolding of the Universe, but not Joe's graduation party.
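Purely as a thought experiment, here is roughly what reading such an event record might look like in Python. No RSE format exists, so every element name below (event, partOf, location, start, durationMinutes) is invented for illustration, as are the example.org URL and the read_event helper. The year and start time are placeholders too, since only "April 9" is given above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Hypothetical "RSE" record: the format and all element names are invented here.
# The year and the 19:00 start time are placeholders, not facts.
EVENT_XML = """
<event>
  <title>Joe's graduation party</title>
  <partOf href="http://example.org/events/graduation-week.xml"/>
  <location>Grand Hall, Moncton</location>
  <start>2005-04-09T19:00</start>
  <durationMinutes>60</durationMinutes>
</event>
"""

def read_event(xml_text):
    """Parse one event record into a plain dictionary."""
    node = ET.fromstring(xml_text.strip())
    start = datetime.strptime(node.findtext("start"), "%Y-%m-%dT%H:%M")
    minutes = int(node.findtext("durationMinutes"))
    return {
        "title": node.findtext("title"),
        # The parent event is referenced by URL, not nested inside this record.
        "part_of": node.find("partOf").get("href"),
        "location": node.findtext("location"),
        "start": start,
        "end": start + timedelta(minutes=minutes),
    }

event = read_event(EVENT_XML)
```

Five elements cover the whole sentence: what, part of what, where, when, and for how long.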

A Third Gospel: if you can't link, it's broken.

The RSS community badly needs a way to link one resource with another. Even if we had RSP, there's no clear and unambiguous way to link the person with, say, the blog entry. Like this, maybe?

<author rdf:resource="http://whatever.rsp" />

Or what? Nobody knows - or at least, if they know, they're not saying. What is The Way to link things? Gosh, this is the core of the Semantic Web and yet is mired in obscurity and mystery. I want a simple page saying here's How to Do It - nothing more, nothing less. Something that leaves me free to link anything with anything, the semantic equivalent in XML to a web link.
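For what it's worth, here is a minimal sketch in Python of what following such a link could involve, assuming the guess above is adopted. The rdf:resource attribute is real RDF/XML syntax for pointing at a resource by URL; the author element, the .rsp address, the PEOPLE lookup table and the linked_resources helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A blog entry that points at its author by URL. The <author> element and the
# .rsp address follow the guess in the post; neither is an established convention.
ITEM_XML = f"""
<item xmlns:rdf="{RDF_NS}">
  <title>Whither the Semantic Web</title>
  <author rdf:resource="http://example.org/people/jane.rsp"/>
</item>
"""

# Stand-in for fetching the linked file; a real tool would retrieve the URL.
PEOPLE = {"http://example.org/people/jane.rsp": {"name": "Jane Example"}}

def linked_resources(xml_text):
    """Return (element name, target URL) for every child carrying rdf:resource."""
    root = ET.fromstring(xml_text.strip())
    # ElementTree spells namespaced attribute names as "{namespace}local".
    key = f"{{{RDF_NS}}}resource"
    return [(child.tag, child.get(key)) for child in root if child.get(key)]

links = linked_resources(ITEM_XML)
author = PEOPLE[links[0][1]]
```

The element name says what the relationship is and the attribute says where the other thing lives, which is about as close to a plain web link as XML gets.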

A Fourth Gospel: if you can't find it, it doesn't exist.

Now, again, all this may exist. But where? Buried somewhere under the morass that is the W3C's website? Paragraph 47.3.45 in the Resource Description Framework technical manual? If any of this exists (which I personally doubt) it should be up front on the W3C site: here's how to get started - describe yourself, describe your resource, link them together. *This* is what we mean by the Semantic Web. *Here* is some script *you* can run using your limited resources that makes it work.

If the system that evolves from the work on the semantic web is one system for Enterprise, and another for the people, then we are creating an information infrastructure that is fundamentally flawed from the outset. The developers of the Semantic Web must, in my opinion, step back from the drawing boards for a bit, and roll out a basic infrastructure that allows people, real people, to communicate real things in a straightforward manner. The potential is there and people are just waiting for it.
