Wordnet for the Web

This page documents the xmlns.com Wordnet vocabulary service, an RDF schema based on version 1.6 of the Wordnet lexical database.

Status

Our Wordnet service is experimental: please do not rely on what we have here for production-grade work. We are serious about developing a mature version of xmlns Wordnet to underpin RDF vocabulary deployment, but caution that flaws in the current design may need fixing.

Note that other RDF representations of Wordnet now exist; a comparison would be useful.

Details

The current service is based solely on the noun term hierarchy from Wordnet. The hypernym hierarchy (a ___ is a kind of ___) is projected into an RDF Schema 1.0 class hierarchy (the ___ class is an rdfs:subClassOf the class ___). This creates a rather large class hierarchy.
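As a rough illustration of this projection, the sketch below turns a few (hyponym, hypernym) pairs into rdfs:subClassOf statements and prints them as N-Triples. The sample pairs, the function names and the N-Triples output format are assumptions made for this example; the actual service queries the Wordnet tools and may work quite differently.

```python
# Minimal sketch: project "X is a kind of Y" pairs into rdfs:subClassOf triples.
# The pairs below are illustrative sample data, not output from the service.

RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
WORDNET_NS = "http://xmlns.com/wordnet/1.6/"

# (hyponym, hypernym): "a Cat is a kind of Feline", and so on.
SAMPLE_HYPERNYMS = [
    ("Cat", "Feline"),
    ("Feline", "Carnivore"),
    ("Dog", "Canine"),
    ("Canine", "Carnivore"),
]


def project_to_subclass_triples(pairs, ns=WORDNET_NS):
    """Return one (subject, predicate, object) triple per hypernym pair."""
    return [(ns + child, RDFS_SUBCLASSOF, ns + parent) for child, parent in pairs]


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(project_to_subclass_triples(SAMPLE_HYPERNYMS)))
```

Each noun in the hierarchy contributes at least one such statement, which is why the resulting class hierarchy is large.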

You can address each noun from Wordnet using Web identifiers (URIs) of the form: http://xmlns.com/wordnet/1.6/xxx where 'xxx' is a word such as 'Cat', 'Tree', 'Person'. If you de-reference these URIs, the xmlns.com service provides an RDF description of that RDF class, plus detailed descriptions of immediate sub-classes and a list of the super-classes above it in the hierarchy.

The word you use can optionally be qualified with a numeric suffix (e.g. 'Dog' vs 'Dog-2') corresponding to the Wordnet sense numbers associated with that word.
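The URI pattern and the optional sense suffix can be sketched as a small helper. The function name and its validation rules are assumptions for this example; it only follows the 'Dog' / 'Dog-2' pattern described above and does not contact the service.

```python
# Minimal sketch: build an xmlns.com Wordnet class URI from a word and an
# optional sense number. wordnet_class_uri is a hypothetical helper name.

WORDNET_NS = "http://xmlns.com/wordnet/1.6/"


def wordnet_class_uri(word, sense=None):
    """Return the class URI for a noun, e.g. 'Dog' or, with sense=2, 'Dog-2'."""
    name = word.strip()
    if not name:
        raise ValueError("word must not be empty")
    if sense is not None:
        if sense < 1:
            raise ValueError("sense numbers start at 1")
        # The numeric suffix selects one Wordnet sense of the word.
        name = f"{name}-{sense}"
    return WORDNET_NS + name


if __name__ == "__main__":
    print(wordnet_class_uri("Dog"))
    print(wordnet_class_uri("Dog", sense=2))
```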

Deployment

You can import this vocabulary into RDF instance data, for example by writing xmlns:wordnet="http://xmlns.com/wordnet/1.6/"

Having done this, you can use any noun from Wordnet as an RDF class (i.e. type). Since RDF/XML syntax provides a special convention for writing class names as XML elements, an instance of a Wordnet class can be described with a single element named after the word.
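To make the element convention concrete, the sketch below generates a small RDF/XML document in which a resource is typed as a Wordnet class by the name of its element. The resource URI (http://example.org/pets#tom) and the function name are made up for this example; only the wordnet namespace declaration comes from the text above.

```python
# Minimal sketch: emit RDF/XML where an element name doubles as the RDF type.
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
WORDNET_NS = "http://xmlns.com/wordnet/1.6/"

# Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("wordnet", WORDNET_NS)


def typed_node_document(class_name, resource_uri):
    """Return RDF/XML stating that resource_uri is an instance of the class."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # <wordnet:Cat rdf:about="..."/> says the resource has rdf:type wordnet:Cat.
    ET.SubElement(
        root,
        f"{{{WORDNET_NS}}}{class_name}",
        {f"{{{RDF_NS}}}about": resource_uri},
    )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The example.org URI is a placeholder, not a real resource.
    print(typed_node_document("Cat", "http://example.org/pets#tom"))
```

The same statement could be written with an explicit rdf:type property; the element form is simply shorter and easier to read.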

Issues

In Wordnet, hypernym relations hold between word clusters ("synonym sets", or synsets), not between the terms themselves. Typically a group of related words is organised together in Wordnet as a synset, which has an obscure numerical identifier but no "dominant" or "representative" term: every term in a synset has equal standing. Consequently, when we create sub-class relations between RDF classes based on words in Wordnet, we ignore some of this structure. Our current implementation is based on real-time queries to the Wordnet tools, and will need re-engineering at the software and schema level to improve this situation.

For RDFWeb and xmlns.com, it is important to use real words ('Cat', 'Person' etc.) as the class identifiers, so that RDF/XML written using this schema is readable and intuitive. We don't want numeric IDs instead, since these would create usability issues for deployment. A possible workaround is to have an RDF class for each word in Wordnet (one for each member of a synset) and a further RDF class for the synset itself. More work needed...
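One way the workaround might look is sketched below. Everything specific here is an assumption and not part of the current service: the synset IDs are invented, the "synset-" URI naming is hypothetical, and linking each word class to its synset class with rdfs:subClassOf is only one of several possible modelling choices.

```python
# Minimal sketch of the "word classes plus synset classes" workaround.
# Synset IDs and the "synset-<id>" URI pattern are hypothetical.

RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
WORDNET_NS = "http://xmlns.com/wordnet/1.6/"

# Invented sample data: synset ID -> member words.
SAMPLE_SYNSETS = {
    "100": ["Cat", "True_cat"],
    "200": ["Feline", "Felid"],
}
# Hypernym relations are stated between synsets, as in Wordnet itself.
SAMPLE_SYNSET_HYPERNYMS = [("100", "200")]


def synset_uri(synset_id, ns=WORDNET_NS):
    """Hypothetical URI for the class that stands for a whole synset."""
    return f"{ns}synset-{synset_id}"


def workaround_triples(synsets, synset_hypernyms, ns=WORDNET_NS):
    """Return subClassOf triples for word classes and for synset classes."""
    triples = []
    for synset_id, words in synsets.items():
        for word in words:
            # Readable word class, attached to its (numeric) synset class.
            triples.append((ns + word, RDFS_SUBCLASSOF, synset_uri(synset_id, ns)))
    for child_id, parent_id in synset_hypernyms:
        # The hierarchy itself lives at the synset level.
        triples.append(
            (synset_uri(child_id, ns), RDFS_SUBCLASSOF, synset_uri(parent_id, ns))
        )
    return triples


if __name__ == "__main__":
    for triple in workaround_triples(SAMPLE_SYNSETS, SAMPLE_SYNSET_HYPERNYMS):
        print(triple)
```

Under this sketch, instance data can keep using readable names such as 'Cat', while the numeric synset classes carry the hypernym structure behind the scenes.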

Next steps...

Discussion is welcomed. Send mail to the RDFWeb rdfweb-dev list, or join the #rdfig IRC channel if you're interested.

One useful thing to do would be to make an RDDL description of the Wordnet namespace available.

Dan Brickley