Freebase Loader
This article describes Alpha level software, which means that it is a highly experimental project that may change or disappear without notice. You are welcome to experiment with this software, but please do not rely on it.
The Freebase Loader (aka Spreadsheet Loader) is an open source, web-based tool that lets you load data from spreadsheets into Freebase. It has been used to load large datasets of topics and properties, and it can handle both simple and quite complex datasets, including those with compound value types (CVTs). Freebase Loader reconciles items in your spreadsheet with topics in Freebase, asking you to confirm or supply information for matches it is unsure of. It can load data either to Sandbox or directly to Freebase.
Freebase Loader can also be used to match data with topics in Freebase without ever loading it into Freebase. This can be quite helpful when you have a set of proprietary data that you'd like to interface with the rich open datasets in Freebase.
Getting Started
There's a visual walkthrough at Freebase_Loader/Getting_Started
The input format is described at Freebase_Loader/Input_Format
Additional information about the Recon service itself is available at http://data.labs.freebase.com/recon/
Features
- You can stop and save your results at any time. At any point you can copy out a spreadsheet that contains all of the reconciliation work you've done so far. If you come back later to Freebase Loader you can paste it back in and pick up right where you left off.
- Asserts included types
- Supports JSON input
Recent Updates:
March 2010
- Undo support
- Updated UI
- JSON output
- Implied types are now asserted (or, conversely, bare properties are not)
February 2010
- Repeated topics in your data can be treated as a single Freebase topic, eliminating repeated work and preventing duplicate entities from being created on upload.
January 2010
- When you create new topics in Freebase.com through a TripleLoader job, the ids of the newly created topics are now added to your spreadsheet
- The interface for uploading to TripleLoader has been greatly improved
- Instead of seeing a JSON response page, you get a progress bar for your job and a link to our new UI for more details
- Support for indexed properties
Edge Version
A bleeding-edge version of Freebase Loader is available at http://data.labs.freebase.com/loader_dev/
Major features often land here first for testing before rolling out to the main Loader URL.
Known Issues
Loader-related issues are tracked in the bug tracker. Below is a summary of important caveats to using Freebase Loader today, as it is still a labs project:
- The reconciliation system does not return exact name matches if the match is untyped, or if the matching topic in Freebase does not have the expected Freebase types asserted on it.
- The reconciliation system sometimes makes incorrect automatches. This is particularly true for topics with very similar names, such as films in a series whose titles differ only by their number; in these cases the reconciliation service may pick an existing topic. It will also automatch different film titles that share a single word if the films are by the same director.
- It is not clear to users that the reconciliation system *is not* the same as autocomplete (Freebase Suggest). As a result, a user may not know that they should *always* enter the proposed item through autocomplete when no match is presented.
- Freebase Loader only accepts MQL date syntax (unlike the client). A /type/datetime value that represents the first millisecond of the 21st century looks like this:
2001-01-01T00:00:00.001Z
Please see the section on /type/datetime in Chapter 2, Metaweb Architecture, for more information. A tracking bug exists for this issue.
- Currently there is no support for loading images or blurbs, or for text in a language other than English. Each of these limitations has its own tracking bug (blurb, image, language).
- There are relatively few error messages.
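Because the Loader only accepts MQL date syntax, it can help to normalize date cells before pasting a spreadsheet in. The following is a minimal Python sketch that produces strings in the same form as the /type/datetime example above. The helper name to_mql_datetime is hypothetical and not part of Freebase Loader, and the sketch assumes the full UTC form with millisecond precision is what you want to emit.

```python
from datetime import datetime, timezone


def to_mql_datetime(dt: datetime) -> str:
    """Format a datetime as an ISO 8601 UTC string with millisecond precision.

    Hypothetical helper (not part of Freebase Loader); it mirrors the
    shape of the example value 2001-01-01T00:00:00.001Z.
    """
    # Assumption: naive datetimes are already in UTC; aware ones are converted.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    else:
        dt = dt.astimezone(timezone.utc)
    # isoformat() renders the UTC offset as "+00:00"; swap it for the "Z" suffix.
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")


# One millisecond is 1000 microseconds.
print(to_mql_datetime(datetime(2001, 1, 1, 0, 0, 0, 1000)))
```

Converting everything to UTC before formatting keeps the output unambiguous regardless of the time zone the source spreadsheet was authored in.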
Source Code
- The source code is on GitHub. Freebase Loader doesn't require a server component and runs just fine from your local machine, which makes experimentation and development easy.