Data dumps

From Freebase

Revision as of 23:34, 7 July 2010

Metaweb provides full data dumps of all current facts and assertions in Freebase. These data dumps are general-purpose extracts of the Freebase data in three formats. Metaweb releases a fresh set of data dumps every three months.

Download

You can download the Freebase Data Dumps from http://download.freebase.com/datadumps/.

If you are looking for our Freebase Wikipedia Extraction, then please go to http://download.freebase.com/wex/.

Formats

TSV

A tab-separated file for each type in Freebase, suitable for loading into spreadsheets or database systems. Each line represents an instance of a Freebase type and columns represent the available properties for the type. You may download the full set, or browse Freebase domains and types to find specific data sets.

  • The full download is approximately 1300 Mbytes compressed with Bzip2.
  • The browseable set contains approximately 7500 TSV files in 100 domains.
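As an illustration of loading one of these per-type files, here is a minimal Python sketch. It assumes that the first line of each TSV file is a header row naming the properties, which this page does not state; the column names in the usage example are made up for illustration.

```python
import csv
from typing import Dict, Iterable, Iterator


def read_type_tsv(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per instance, keyed by property (column) name.

    Assumption: the first line is a header row of property names.
    """
    # QUOTE_NONE keeps quote characters in the data as literal text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield dict(row)


# Usage with hypothetical in-memory data (not real Freebase rows):
sample = [
    "name\tid\tdate_of_birth\n",
    "Example Person\t/en/example_person\t1970-01-01\n",
]
for instance in read_type_tsv(sample):
    print(instance["id"], instance["name"])
```

The function accepts any iterable of text lines, so it can also be given an open file object for a decompressed TSV file.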

Link Export

A full dump of Freebase assertions as tab-separated UTF-8 text. This is a complete "low-level" dump of the data, suitable for post-processing into RDF or XML datasets. The link export is a series of lines, one assertion per line. Each line is a tab-separated quadruple: <source>, <property>, <destination>, <value>. An assertion is a statement of fact about the <source> object. In every assertion, <destination>, <value>, or both are present.

  • A sample of this output is available.
  • The Link Export is approximately 4000 Mbytes compressed with Bzip2.
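The quadruple format described above could be read with something like the following Python sketch. It assumes that an absent <destination> or <value> appears as an empty field between tabs, which this page does not spell out. The names `Assertion`, `parse_assertions`, and `read_link_export` are illustrative, not part of any Freebase tool.

```python
import bz2
from typing import Iterable, Iterator, NamedTuple, Optional


class Assertion(NamedTuple):
    source: str
    prop: str
    destination: Optional[str]
    value: Optional[str]


def parse_assertions(lines: Iterable[str]) -> Iterator[Assertion]:
    """Parse tab-separated quadruples, one assertion per line."""
    for number, line in enumerate(lines, start=1):
        # Strip only the line ending; a trailing tab marks an empty <value>.
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 fields, got {len(fields)}")
        source, prop, destination, value = fields
        # Assumption: an absent field is an empty string.
        if not destination and not value:
            raise ValueError(f"line {number}: neither destination nor value present")
        yield Assertion(source, prop, destination or None, value or None)


def read_link_export(path: str) -> Iterator[Assertion]:
    """Stream assertions from a bzip2-compressed link export file."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        yield from parse_assertions(handle)
```

Because the export is several gigabytes compressed, the sketch streams line by line rather than loading the whole file. A call such as `read_link_export("link-export.tsv.bz2")` uses a placeholder file name; substitute the name of the file you downloaded.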

Simple Topic Dump

A tab-separated file containing basic identifying data about every topic in Freebase. The columns are: GUID, English display name, Freebase /en keys (comma-separated), numeric English Wikipedia keys (comma-separated), Freebase types (comma-separated), and a short text description from Wikipedia (when available). Tabs and newlines are backslash-escaped, and null fields are represented by "\N".

  • The Simple Topic Dump is approximately 1000 Mbytes compressed with Bzip2.

License

Metaweb Technologies provides the Freebase Data Dumps free of charge for any purpose and updates them regularly. They are distributed, like Freebase itself, under the Creative Commons Attribution (CC-BY) license, and their use is subject to the Freebase Terms of Service. If you include data from these data dumps in a website or application, you must attribute us as described in the Freebase Licensing Policy.

Citing

If you'd like to cite these data dumps in a publication, you may use:

Metaweb Technologies, Freebase Data Dumps, http://download.freebase.com/datadumps/, <month> <day>, <year>

Or as BibTeX:

@misc{metaweb:datadumps,
  title = "Freebase Data Dumps",
  author = "Metaweb Technologies",
  howpublished = "\url{http://download.freebase.com/datadumps/}",
  edition = "<month> <day>, <year>",
  year = "<year>"
}
