2009-11-03

The Semantic Web

Excerpts from the article The Semantic Web, by Tim Berners-Lee, James Hendler, and Ora Lassila; published by Scientific American in 2001:

In general, computers have no reliable way to process semantics.

The Semantic Web is an extension of the current Web where information is given well-defined meaning. It should be universal and as decentralized as possible, even at the cost of not having total consistency.

The Semantic Web leverages Knowledge Representation techniques: structured collections of information and sets of inference rules.
Gödel's theorem: "Any system that is complex enough to be useful also encompasses unanswerable questions and paradoxes like 'This sentence is false'".
Semantic Web researchers accept that paradoxes and unanswerable questions are a price that must be paid to achieve versatility.

The challenge of the Semantic Web is to provide a language that expresses both data and rules for reasoning about the data and that allows rules from any existing knowledge-representation system to be exported onto the Web.
Adding logic to the Web means using rules to make inferences, choose courses of action, and answer questions. The logic must be powerful enough to describe complex properties of objects but not so powerful that agents can be tricked by paradoxes.

The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writing.

Ontologies are documents or files that formally define the relations among terms, e.g. a taxonomy plus a set of inference rules.
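
The taxonomy-plus-rules idea can be sketched in a few lines. Below is a minimal, hypothetical example (the class names and the single rule are my own, not from the article): a subclass relation and one inference rule, "if X is a subclass of Y and Y is a subclass of Z, then X is a subclass of Z".

```python
# Hypothetical taxonomy: a set of (subclass, superclass) pairs.
subclass_of = {
    ("Dalmatian", "Dog"),
    ("Dog", "Mammal"),
    ("Mammal", "Animal"),
}

def infer_transitive(pairs):
    """Apply the transitivity rule until no new facts are inferred
    (i.e. compute the transitive closure of the relation)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

inferred = infer_transitive(subclass_of)
print(("Dalmatian", "Animal") in inferred)  # → True
```

The inferred fact ("Dalmatian" is an "Animal") is stated nowhere in the original data; it follows from the taxonomy via the rule, which is exactly the kind of machine inference the article envisions.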

XML provides hidden labels for data. RDF expresses meaning in triples: subject, verb, and object, each universally identified by a URI.
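
A triple store can be modeled as plainly as a set of URI tuples. This sketch uses made-up example.org URIs (my own, purely illustrative) and a simple pattern-matching query:

```python
# Hypothetical URIs; a tiny triple store as a set of (subject, verb, object).
triples = {
    ("http://example.org/people#alice",
     "http://example.org/terms#worksAt",
     "http://example.org/orgs#acme"),
    ("http://example.org/orgs#acme",
     "http://example.org/terms#locatedIn",
     "http://example.org/places#boston"),
}

def objects(store, subject, verb):
    """Return every object matching a (subject, verb, ?) pattern."""
    return {o for (s, v, o) in store if s == subject and v == verb}

employers = objects(triples,
                    "http://example.org/people#alice",
                    "http://example.org/terms#worksAt")
print(employers)  # → {'http://example.org/orgs#acme'}
```

Because every term is a URI rather than a local string, triples published by different sites can refer to the same concept unambiguously, which is what makes the data webwide rather than per-document.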

The Semantic Web can also include physical objects identified by URIs and described by RDF.

Required work:
- Proof interchange format;
- Service discovery;
- Digital signatures to achieve trust.

--

The Semantic Web is indeed an ambitious and useful research goal. However, a question lingers in my mind: How will the Web be annotated to start with?

How can people be motivated to make the extra effort to annotate their content using the mentioned "off-the-shelf software for writing Semantic Web pages"? Maybe using an approach similar to CAPTCHAs?
