RDF
This article describes Beta level software, which means that it is not yet fully stable, and may change in backwards-incompatible ways, but is intended to eventually become a fully supported product. Please use at your own risk.
RDF is a key component of Freebase's Semantic Web functionality. Each topic in Freebase is available in RDF format, which makes it part of the Linked Open Data cloud. The RDF service was launched as a Beta-level project in October 2008.
(The following explanation of Freebase's RDF service is taken from the blog post by Jamie Taylor when the RDF service was launched.)
The RDF service provides topic views using the Resource Description Framework. RDF is a very general approach to data modeling that has become a standard (actually, a W3C Recommendation) for exchanging graph data structures.
The Freebase RDF service allows applications to retrieve a subgraph of data connected to a specific Freebase object through a simple HTTP GET request. The URI for this request is simply the concatenation of the RDF service URL and the Freebase identifier (/type/object/id), with the leading slash dropped and the remaining slashes in the ID replaced with dots (this conversion to a “dot notation” makes the QNames with these identifiers much nicer). So the URI for the Topic about the movie Blade Runner (/en/blade_runner) becomes http://rdf.freebase.com/ns/en.blade_runner.
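The ID-to-URI conversion described above can be sketched in a few lines of Python. This is only an illustration: the function name and the RDF_NS constant are made up for this example, and the only rule it applies is the one stated above (drop the leading slash, turn the remaining slashes into dots).

```python
# Base URL of the RDF service, taken from the Blade Runner example above.
RDF_NS = "http://rdf.freebase.com/ns/"


def freebase_id_to_rdf_uri(freebase_id: str, base: str = RDF_NS) -> str:
    """Turn a Freebase ID such as /en/blade_runner into an RDF service URI.

    Hypothetical helper, not part of any Freebase client library.
    """
    if not freebase_id.startswith("/"):
        raise ValueError("Freebase IDs are expected to start with '/'")
    # Drop the leading slash, then switch to the "dot notation".
    dotted = freebase_id[1:].replace("/", ".")
    return base + dotted


print(freebase_id_to_rdf_uri("/en/blade_runner"))
```

Keeping the base URL as a parameter makes it easy to point the same helper at a different namespace prefix if one is needed.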
This URI, which represents a Freebase ID, comes in very handy when you want to unambiguously refer to a concept on the web. For instance, while cataloging photos in a collection it would be useful to disambiguate between photos of the band Pantera (http://rdf.freebase.com/ns/en.pantera) and the Pantera car (http://rdf.freebase.com/ns/en.de_tomaso_pantera). While you could make up “tags” which would help identify the two different meanings of “Pantera,” these tags would be idiosyncratic to your collection. By using the RDF service URI as a strong identifier for the concept being represented, anyone (or any machine) on the web can figure out what you are talking about.
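As a rough illustration of the photo catalog example, the sketch below labels each photo with a concept URI instead of a free-text tag. The two URIs are the ones given above; the file names, the PHOTOS list and the photos_by_concept function are invented for this example.

```python
from collections import defaultdict

# Concept URIs from the text above.
PANTERA_BAND = "http://rdf.freebase.com/ns/en.pantera"
PANTERA_CAR = "http://rdf.freebase.com/ns/en.de_tomaso_pantera"

# Hypothetical catalog: (file name, concept URI) pairs.
PHOTOS = [
    ("concert_01.jpg", PANTERA_BAND),
    ("car_show.jpg", PANTERA_CAR),
    ("backstage.jpg", PANTERA_BAND),
]


def photos_by_concept(catalog):
    """Group file names by the concept URI they are labeled with."""
    index = defaultdict(list)
    for filename, concept_uri in catalog:
        index[concept_uri].append(filename)
    return dict(index)


index = photos_by_concept(PHOTOS)
print(index[PANTERA_CAR])
```

Because the keys are URIs rather than the ambiguous word “Pantera,” another collection that uses the same identifiers can be merged with this one without guessing which meaning was intended.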
Of course, having unambiguous identifiers gets much more interesting when lots of people start using them. The Linked Data community is working to create an ecosystem of interlinked RDF data using shared concept identifiers and shared vocabularies to describe the relationships between data elements. These efforts represent the first steps towards a semantic web.
For developers using RDF, Freebase provides a large collection of vocabularies through its schema models, which express a wide range of domains and areas of interest. Since the Freebase RDF service references the live Freebase graph, developers can model their own vocabularies by creating domains, types and properties within Freebase and referencing the vocabulary using RDF service URIs. This makes Freebase an excellent place to coordinate work on collaborative models that will be used by several external sites.
Another one of Freebase’s strengths is its ability to maintain “keys” (/type/object/key) which contain references to other (external) data sources. We currently use such keys to link to Wikipedia, IMDB, Netflix, GNIS, IATA, and other such sources. Since Freebase attempts to merge all references to a concept onto a single Topic, once an application has referenced a Topic it will have access to all the information Freebase knows about the concept in one place, including all the keys for finding more information about the concept. One of the current data modeling efforts is to represent how keys can be converted into external URIs using things like URI Templates (/common/uri_template) for use within the Linked Data community.
The current Freebase RDF service generates a rather raw view of Freebase Topics. The vocabularies used to describe the relationships between pieces of data in the graph only refer to Freebase schemas. Ideally there would be a way to say that a Freebase /people/person is comparable to the Friend-of-a-Friend vocabulary’s Person. While we could have simply asserted that equivalence (and many others), it seems more useful if the community could build a data model within Freebase that specifies the association between various vocabularies. That way, users could specify their own mappings and, in the future, ask the RDF Service to generate data using the vocabulary most natural for their own purposes. A nascent effort towards this type of data model exists at http://www.freebase.com/view/user/jamie/web_ontology. We encourage RDF enthusiasts to expand upon the idea and discuss various approaches on the mailing list.
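Until such a mapping model exists inside Freebase, a client could apply its own mapping after fetching the data. The sketch below is one possible way to do that, not something the RDF service offers: triples are represented as plain tuples, the VOCAB_MAP table and translate_triples function are invented for this example, and the subject URI is a placeholder. The FOAF Person and rdf:type URIs are the standard ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# User-supplied mapping from Freebase schema URIs to another vocabulary.
# Only the /people/person to FOAF Person pair comes from the text above.
VOCAB_MAP = {
    "http://rdf.freebase.com/ns/people.person": "http://xmlns.com/foaf/0.1/Person",
}


def translate_triples(triples, mapping=VOCAB_MAP):
    """Rewrite predicates and objects that appear in the mapping.

    Subjects are left untouched so topic identifiers stay stable.
    """
    return [
        (s, mapping.get(p, p), mapping.get(o, o))
        for (s, p, o) in triples
    ]


# Placeholder subject URI, used for illustration only.
raw = [("http://rdf.freebase.com/ns/en.some_person", RDF_TYPE,
        "http://rdf.freebase.com/ns/people.person")]
print(translate_triples(raw))
```

Keeping the mapping as data rather than code means different users can swap in their own tables, which is the same idea as storing the mappings in a shared Freebase data model.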
We look forward to working with developers to create useful applications on top of the RDF service and expanding the web of data.
Note: the RDF service uses content negotiation to determine what type of data should be returned to the HTTP client. Standard web browsers prefer HTML, so the RDF service will redirect you to the regular, human-readable Freebase view. If you want to see the RDF output using a regular web browser, you can fetch the “information resource” directly by using URIs like http://rdf.freebase.com/rdf/en/blade_runner, or you can install the Tabulator plugin for Firefox.
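A non-browser client takes part in content negotiation by sending an Accept header that asks for an RDF media type. The sketch below builds such a request with Python's standard library without sending it. The build_rdf_request name is made up for this example, and application/rdf+xml is an assumed media type: the text above does not say which RDF serializations the service returns.

```python
import urllib.request


def build_rdf_request(uri: str, media_type: str = "application/rdf+xml"):
    """Build an HTTP GET request that asks for RDF instead of HTML.

    media_type is an assumption; use whichever serialization the
    service is known to offer.
    """
    return urllib.request.Request(uri, headers={"Accept": media_type})


req = build_rdf_request("http://rdf.freebase.com/ns/en.blade_runner")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# To actually fetch the data, pass the request to urllib.request.urlopen(req).
```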
Known issues
- Due to a bug, the RDF interface does not return all data for a topic if the topic contains more than 100 incoming links or more than 100 outgoing links. There is currently no timescale for fixing this issue.