Search Cookbook

This page contains a list of recipes for different ways to constrain search queries using the Search Service.


Textual Constraints

Textual data for a Freebase entity comes first from its name and its aliases, then from its keys and other textual properties and finally from its Wikipedia anchor data if it was reconciled with a language-specific Wikipedia topic.

Textual constraints are language-specific. Nine languages are currently supported: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Chinese (zh), Japanese (ja) and Korean (ko). English has by far the most coverage and is the default language.

A textual constraint is specified with the query parameter. Its language is specified with the lang parameter. For example:

query: "gore"
query: "gore" lang: "fr"
query: "gore" lang: "de"
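As a rough sketch of how these parameters might be sent to the service, the Python snippet below encodes them into a GET request URL. The endpoint address and the helper name build_search_url are placeholders made up for illustration; this page does not name an endpoint, so substitute the address of the Search Service you are using.

```python
from urllib.parse import urlencode

# Placeholder endpoint, for illustration only; this page does not name one.
SEARCH_ENDPOINT = "https://example.org/freebase/search"


def build_search_url(**params):
    """Encode Search Service parameters into a GET request URL.

    Booleans are rendered as "true"/"false", matching the way the
    recipes on this page write them; None values are skipped.
    """
    encoded = {}
    for key, value in params.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        encoded[key] = value
    return SEARCH_ENDPOINT + "?" + urlencode(encoded)


# query: "gore" lang: "fr"
url = build_search_url(query="gore", lang="fr")
print(url)
```

Parameters are passed as keyword arguments so that each recipe on this page maps onto one call.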

During indexing, textual data is normalized in a language-specific way. For example, in English, text is lowercased and accents are removed. At query time, the same language-specific normalization is performed on the query text, so the following two queries are equivalent:

query: "beyoncé"
query: "beyonce"
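The effect of the English normalization can be approximated locally. The sketch below only imitates the two steps named above (lowercasing and accent removal) with Python's standard library; the function name normalize_english is made up for illustration, and the service's actual rules may well do more than this.

```python
import unicodedata


def normalize_english(text):
    """Approximate the normalization described above: lowercase the text
    and strip accents. The service's real rules may differ."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    # Drop the combining marks (accents) left behind by the decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Both spellings reduce to the same normalized text.
print(normalize_english("Beyoncé") == normalize_english("beyonce"))
```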

Several parameters control how a textual constraint is matched:

prefixed: when set to true, triggers a prefix match on name and aliases only (and a regular match on other textual data). For example:

query: "bob dy" prefixed: true

stemmed: when set to true, triggers a stemmed match on name and aliases only (and a regular match on other textual data). Stemmed matches may be used to paper over language-specific suffix differences introduced by plurals or other grammatical forms. For example:

query: "potatos" stemmed: true

A phrase match is triggered by surrounding the query text with double quotes (""). The text tokens in the query must appear next to each other in the matching entity's textual data. For example:

query: "\"to be or not to be\""
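The backslashes in the example above look like ordinary string escaping: the outer quotes delimit the parameter value and the inner quotes are the ones that request the phrase match. Under that assumption, a small sketch (the helper name phrase_query is hypothetical) shows how the escaped form arises when the value is serialized as a JSON string:

```python
import json


def phrase_query(text):
    """Wrap the query text in double quotes to request a phrase match."""
    return '"%s"' % text


query = phrase_query("to be or not to be")
# Serializing the value as a JSON string escapes the inner quotes,
# which yields the backslash form used in the example above.
print(json.dumps(query))
```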

Matching on a Name or Alias

alias: will match on both a name and aliases (/common/topic/alias "Also known as").

name: will match only on a name.

Match on /people/person entities whose name contains the word 'gore':

filter: "(all name:gore alias:gore type:/people/person)"

Match on /people/person entities whose alias, but not their name, contains the word 'gore':

filter: "(all (not name:gore) alias:gore type:/people/person)"

In addition to specifying which text fields should be matched, it is also possible to specify how the match should occur by inserting one of the following modifiers between the operand and the text:

  • {word} : require that the words in the string match words in the corresponding text field in the document. (default)
  • {phrase} : require that the words occur next to each other in the same order in the corresponding text field in the document.
  • {full} : like {phrase} but also require that the phrase exactly match the text field, not just words within the text field. Known as a "full match".

For example, to find the musical single called Home by Marc Broussard, you would use a filter like this:

filter: "(all type:/music/single name{full}:\"home\" /music/track/artist:\"Marc Broussard\")"
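Filter expressions like the ones in this section can be assembled programmatically rather than by string concatenation. The following is a minimal sketch; the helper names term and group are made up for illustration and are not part of the Search Service. It quotes only values that contain whitespace, on the assumption that single-word values may be left bare as in name:gore.

```python
def term(operand, value, match=None):
    """Render one constraint such as name{full}:home.

    `match` is one of the modifiers listed above ("word", "phrase",
    "full") or None for the default. Values containing whitespace are
    wrapped in double quotes.
    """
    modifier = "{%s}" % match if match else ""
    if any(ch.isspace() for ch in value):
        value = '"%s"' % value
    return "%s%s:%s" % (operand, modifier, value)


def group(operator, *parts):
    """Combine constraints or nested groups with all, any or not."""
    return "(%s %s)" % (operator, " ".join(parts))


flt = group(
    "all",
    term("type", "/music/single"),
    term("name", "home", match="full"),
    term("/music/track/artist", "Marc Broussard"),
)
print(flt)
```

Because group accepts the output of other group calls, negation such as group("not", term("name", "gore")) can be nested inside an all group in the same way.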

Language Constraints

As described with textual constraints, the lang parameter is used to specify what language normalization rules to use to transform text into query tokens. The language of the query also conditions result ranking as freebase-search gets a language-specific relevance signal from the corresponding language Wikipedia.

Currently, nine languages are supported: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Chinese (zh), Japanese (ja) and Korean (ko). English has by far the most coverage and is the default language.

There are two supported ways to constrain on languages:

1. Searching with multiple languages at the same time

The lang parameter accepts a comma-separated list of language codes. The search is performed in all the languages specified, the results are ranked in the first language listed, and each result is displayed in the first language of the list that has a name for the entity.

For example:

  • Searching for the German word "Sonnenblume" in German and French, but ranking and displaying the results in French:
query: "Sonnenblume" lang: "fr,de"
  • Searching in English for movies whose language is Korean and displaying their Korean name (the English part of the query is the word "korean" in the expressed_by constraint):
filter: "(all expressed_by:korean type:/film/film)" lang: "ko,en"
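A small helper can build the comma-separated lang value and reject codes outside the nine supported languages listed earlier. This is only a sketch: the name lang_param is hypothetical, and the check is done client-side rather than by the service.

```python
SUPPORTED_LANGS = ("en", "es", "fr", "de", "it", "pt", "zh", "ja", "ko")


def lang_param(*codes):
    """Join language codes into a lang value such as "fr,de".

    The first code is the one used for ranking and preferred for display.
    """
    if not codes:
        raise ValueError("at least one language code is required")
    unknown = [code for code in codes if code not in SUPPORTED_LANGS]
    if unknown:
        raise ValueError("unsupported language code(s): " + ", ".join(unknown))
    return ",".join(codes)


# Rank and display in French, but also search German text.
print(lang_param("fr", "de"))
```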

2. Search regardless of language and "pretend everything is English"

Sometimes you may not know the language that your query text is in (perhaps because you are not a native speaker) and you need to search across all languages to find any matches.

With the lang=s/all hack, all text is indexed as English regardless of its actual language. It only works with prefixed=true and only searches names and aliases. While it is much faster than the first solution, it also produces noisy results, so you may not get the results you expect.

filter: "(all name:\"Politecnico Torino\")" lang: "s/all" prefixed: true

Schema Constraints

Schema constraints are specified with the type: and the domain: parameters.

type: corresponds to the /type/object/type property values of an entity. For example, restricting a search to people only:

query: "gore" type: "/people/person"

domain: corresponds to the /type/type/domain values of all /type/object/type values of an entity. For example, restricting a search to entities in French in the /film domain only:

query: "babar" domain: "/film" lang: "fr"

You can also use individual Freebase properties to filter a query. For example, restricting a search to people who are from Canada:

query: "john" filter: "(all type:/people/person /people/person/nationality:\"Canada\")"

Metaschema Constraints

Metaschema constraints filter entities by semantic predicates. These predicates are higher level concepts built from collections of Freebase properties describing similar semantic relationships.

The constraints are specified via filter parameter operands combined with an entity name or mid constraint, for example part_of:sweden or center:/m/0d6lp.

Metaschema Predicates Index:

The following metaschema filter operands are supported by the Search API:

abstraction, from the hasabstraction semantic relationship. For example:

"fettuccine dishes"

filter: "(all abstraction:fettuccine)"

abstraction_of, from the abstractionof semantic relationship. For example:

"class of the Western Bulwark locomotive"

filter: "(all abstraction_of:\"Western Bulwark\")"

adaptation_of, from the adaptationof semantic relationship. For example:

"Works La Traviata is an adaptation of"

filter: "(all adaptation_of:\"La Traviata\")"

administered_by, from the administeredby semantic relationship. For example:

"Cannes awards"

filter: "(all type:awards administered_by:cannes)"

administers, from the administers semantic relationship. For example:

"Who runs the Synapse newspaper ?"

filter: "(all administers:synapse)"

appears_in, from the appearsin semantic relationship. For example:

"characters in the Magic Flute"

filter: "(all appears_in:\"magic flute\")"

"Figuren in der Zauberflöte"

filter: "(all appears_in:\"Die Zauberflöte\")" lang: "de"

broader_than, from the broaderthan semantic relationship. For example:

"line of aircraft that the Airbus 319 belongs to"

filter: "(all broader_than:\"Airbus A319\")"

category, from the hascategory semantic relationship. For example:

"french actresses"

filter: "(all category:female origin:france notable:actor)"
filter: "(all category:female origin:france practitioner_of:actor)"

"california or french volcanos"

filter: "(all category:volcano (any part_of:california part_of:france))"

"pasta dishes"

filter: "(all category:pasta)"

center, from the hascenter semantic relationship. For example:

"airlines with a hub in San Francisco"

filter: "(all type:airline center:\"San Francisco\")"

"airlines with hubs in San Francisco and Atlanta"

filter: "(all type:airline center:\"San Francisco\" center:atlanta)"

"newspapers centered in San Francisco"

filter: "(all type:/book/newspaper center:/m/0d6lp)"

center_for, from the centerfor semantic relationship. For example:

"sports facilities for the San Francisco 49ers"

filter: "(all center_for:\"san francisco 49ers\")"

certification, from the hascertification semantic relationship. For example:

"R-rated movies by Wim Wenders"

filter: "(all type:/film/film contributor:wenders certification:r)"

character, from the hascharacter semantic relationship. For example:

"works which have Papageno as character"

filter: "(all character:papageno)"

child, from the haschild semantic relationship. For example:

"parents of Bill Clinton"

filter: "(all child:\"bill clinton\")"

contributed_to, from the hascontributor semantic relationship. For example:

"Who contributed to Blade Runner ?"

filter: "(all contributed_to:\"Blade Runner\")"

contributor, from the hascontributor semantic relationship. For example:

"movies by Steven Spielberg"

filter: "(all type:/film/film contributor:\"Steven Spielberg\")"
filter: "(all type:/film/film contributor:/m/06pj8)"

"movies with Harrison Ford"

filter: "(all type:/film/film contributor:\"Harrison Ford\")"

"movies with Harrison Ford by Steven Spielberg"

filter: "(all type:/film/film contributor:\"Harrison Ford #actor\" contributor:\"Steven Spielberg #directed_by\")"

"westerns with Harrison Ford as actor and Steven Spielberg as executive producer"

filter: "(all type:/film/film genre:western contributor:\"Harrison Ford #actor\" contributor:\"Steven Spielberg #executive_produced_by\")"

created, from the created semantic relationship. For example:

"who created 'for whom the bell tolls'"

filter: "(all created:\"for whom the bell tolls\")"

created_by, from the createdby semantic relationship. For example:

"software by Google"

filter: "(all notable:software created_by:google)"

discovered, from the discovered semantic relationship. For example:

"discoverers of radium"

filter: "(all discovered:radium)"

discovered_by, from the discoveredby semantic relationship. For example:

"discoveries by Curie"

filter: "(all discovered_by:curie)"

distributed_by, from the distributedby semantic relationship. For example:

"NPR shows"

filter: "(all type:show distributed_by:npr)"

exhibited, from the exhibited semantic relationship. For example:

"where was 'down by law' presented ?"

filter: "(all exhibited:\"down by law\")"

exhibited_at, from the exhibitedat semantic relationship. For example:

"nominated works shown at the 2010 Cannes Film Festival"

filter: "(all type:\"nominated work\" exhibited_at:\"2010 Cannes Film festival\")"

expressed_by, from the expressedby semantic relationship. For example:

"books in esperanto"

filter: "(all type:book expressed_by:esperanto)"

fictional_link, from the hasfictionalrelationship semantic relationship. For example:

"fictional characters related to Mickey Mouse"

filter: "(all type:/fictional_universe/fictional_character fictional_link:\"mickey mouse\")"

genre, from the hasgenre semantic relationship. For example:

"gothic cathedrals"

filter: "(all category:cathedral genre:gothic)"

"gothic cathedrals by Viollet-le-duc"

filter: "(all category:cathedral genre:gothic created_by:viollet)"

identifies, from the identifies semantic relationship. For example:

"What identifies Southwest Airlines ?"

filter: "(all identifies:\"Southwest Airlines\")"

leader, from the hasleader semantic relationship. For example:

"Mitch Kapor companies"

filter: "(all type:company leader:kapor)"

leader_of, from the leaderof semantic relationship. For example:

"Paris mayors"

filter: "(all title:mayor leader_of:paris)"

made_of, from the composedof semantic relationship. For example:

"wax paintings"

filter: "(all type:painting made_of:wax)"

means_of_demise, from the hasmeansofdemise semantic relationship. For example:

"executed politicians"

filter: "(all type:politician means_of_demise:\"capital punishment\")"

member_of, from the memberof semantic relationship. For example:

"african monarchs"

filter: "(all type:monarch member_of:africa)"

"Democratic politicians and notable actors"

filter: "(all type:politician member_of:democratic notable:actor)"

narrower_than, from the narrowerthan semantic relationship. For example:

"examples of v8 engines"

filter: "(all type:engine narrower_than:\"v8 engine\")"

occurs_in, from the occursin semantic relationship. For example:

"languages spoken in Romania"

filter: "(all type:language occurs_in:romania)"

origin, from the hasplaceoforigin semantic relationship. For example:

"Republican governors from Austria"

filter: "(all title:governor member_of:republican origin:austria)"

owner, from the hasowner semantic relationship. For example:

"makes owned by Ford"

filter: "(all type:make owner:ford)"

owns, from the owns semantic relationship. For example:

"Who owns the Mavericks ?"

filter: "(all owns:mavericks)"

parent, from the hasparent semantic relationship. For example:

"Al Gore's children"

filter: "(all parent:\"al gore\")"

"descendants of the Lisp programming language"

filter: "(all type:/computer/programming_language parent:lisp)"

part_of, from the partof semantic relationship. For example:

"swedish lakes"

filter: "(all type:lake part_of:sweden)"

"competitions at the 2008 summer olympics"

filter: "(all type:competition part_of:\"2008 summer olympics\")"

participant, from the hasparticipant semantic relationship. For example:

"Bowie concerts"

filter: "(all participant:bowie type:concert)" 

participated_in, from the participatedin semantic relationship. For example:

"Notable austrian skiers who participated in Olympics"

filter: "(all notable:skier member_of:austria participated_in:olympics)"

peer_of, from the peerof semantic relationship. For example:

"politicians peers of Al Gore"

filter: "(all notable:politician peer_of:gore)"

permits_use_of, from the permitsuseof semantic relationship. For example:

"Diesel engines"

filter: "(all permits_use_of:diesel)"

portrayed, from the portrayed semantic relationship. For example:

"actors who portrayed John Lennon"

filter: "(all notable:actor portrayed:\"john lennon\")"

portrayed_by, from the portrayedby semantic relationship. For example:

"characters portrayed by Harrison Ford"

filter: "(all portrayed_by:\"Harrison Ford\")"

practitioner_of, from the practitionerof semantic relationship. For example:

"female african american lawyers"

filter: "(all category:female category:\"african american\" practitioner_of:lawyer)"

preceeding, from the haspreceedingwork semantic relationship. For example:

"sequels to The Lord of the Rings, the two Towers"

filter: "(all type:/film/film preceeding:\"The Lord of the Rings, the two Towers\")"

produced_by, from the producedby semantic relationship. For example:

"Apple computers"

filter: "(all type:computers produced_by:apple)"

publication, from the haspublication semantic relationship. For example:

"which book has /m/0clw238 as first edition ?"

filter: "(all publication:/m/0clw238)"

publication_of, from the publicationof semantic relationship. For example:

"releases of La Traviata"

filter: "(all publication_of:\"La Traviata\")"

service_area, from the hasservicearea semantic relationship. For example:

"California broadcasters"

filter: "(all type:broadcaster service_area:california)"

status, from the hasstatus semantic relationship. For example:

"retreating swiss glaciers"

filter: "(all type:glacier status:retreating part_of:switzerland)"

subclass_of, from the subclassof semantic predicate. For example:

"kinds of swimwear"

filter: "(all subclass_of:swimwear)"

subject, from the hassubject semantic relationship. For example:

"movies about the Holocaust"

filter: "(all type:film subject:holocaust)"

"books about mathematics"

filter: "(all type:book subject:mathematics)"

subsequent, from the hassubsequentwork semantic relationship. For example:

"prequels to The Lord of the Rings, the two Towers"

filter: "(all type:/film/film subsequent:\"The Lord of the Rings, the two Towers\")"

succeeded_by, from the succeededby semantic relationship. For example:

"Which automotive platform was succeeded by the Ford B3 platform ?"

filter: "(all succeeded_by:\"ford b3 platform\")"

succeeds, from the succeeds semantic relationship. For example:

"Who succeeded the House of Stuart ?"

filter: "(all succeeds:stuart)"

superclass_of, from the superclassof semantic predicate. For example:

"Classes coronary heart disease belongs to"

filter: "(all superclass_of:\"coronary heart disease\")"

title, from the hastitle semantic relationship. For example:

"Google engineers"

filter: "(all title:engineer member_of:google)"

tookplace_at, from the tookplaceat semantic relationship. For example:

"battles that took place at Marengo"

filter: "(all type:battles tookplace_at:marengo)"

use_permitted_by, from the usepermittedby semantic relationship. For example:

"File formats supported on an iPhone"

filter: "(all type:\"file format\" use_permitted_by:iphone)"

Scoring and Ranking

Each Freebase entity has an inherent relevance score (ranking), computed during indexing, that is a function of its inbound and outbound link counts in Freebase and Wikipedia. Some popular Freebase entities also have a popularity score computed by Google. By default, both scores are combined during queries.

When a textual constraint is present, a textual match score is computed from the number of hits returned by the search index and is combined with the relevance score.

FreebaseSearch results are always sorted by the final score, highest score first.

The scoring parameter controls which relevance score components are used to compute the final score:

scoring: freebase - Use only the Freebase relevance score.

query: "beyoncé" scoring: freebase

scoring: entity - Use both relevance scores, replacing any missing Google score with 1.0. This is the default.

query: "beyoncé" scoring: entity

scoring: schema - Use when looking for schema entities like types, properties or domains. The link counts of schema entities are computed differently.

query: "performance" scoring: schema

Other Constraints

Entities can be filtered by index tag using the with: or without: parameters. Entities are tagged during indexing, each tag corresponding to one or several Freebase queries that would be too expensive to run during search.

with: restricts the results to entities that have a particular tag.

commons is one such tag; it can be used to restrict a schema search to Freebase Commons schema only. For example, to find Freebase Commons types matching the word "color":

query: "color" type: "/type/type" with: "commons"

gg is a tag that can be used to restrict a search to entities for which there is or isn't a Google popularity score.

query: "1923" type: "/people/person" with: "gg"
query: "1923" type: "/people/person" without: "gg"

The without: parameter is equivalent to a negated with: operand, written (not with:...), in a filter expression. The following two queries are equivalent:

query: "color" limit: 5 type: "/type/type" without: "commons"
query: "color" limit: 5 type: "/type/type" filter: "(not with:commons)"
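The equivalence between the two forms can be expressed as a small rewrite over a parameter dictionary. This is only a sketch of the simple case shown above: the function name without_as_filter is hypothetical, and requests that already carry a filter are rejected rather than merged.

```python
def without_as_filter(params):
    """Rewrite a `without` parameter as the equivalent `(not with:TAG)` filter.

    Only handles the simple case where no filter is already present.
    """
    if "filter" in params:
        raise ValueError("this sketch does not merge with an existing filter")
    rewritten = {k: v for k, v in params.items() if k != "without"}
    if "without" in params:
        rewritten["filter"] = "(not with:%s)" % params["without"]
    return rewritten


before = {"query": "color", "limit": 5, "type": "/type/type", "without": "commons"}
after = without_as_filter(before)
print(after)
```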