Compound Value Type
A Compound Value Type (CVT) is a type within Freebase used to represent data where each entry consists of multiple fields. CVTs may be a little confusing at first, but they are an important part of the Freebase schema and one of the things that makes it distinctive and able to represent complex data.
Consider the following example: the population of a city changes over time. That means that whenever you query Freebase for a population, you are at least implicitly asking for the population at a certain date. Two values are involved: a number of people and a date. This is a situation where a CVT becomes extremely useful. Without one, to model population data you would need to create a topic, name it something like 'Vancouver's population in 1997', and enter the information there.
A CVT can be thought of as a topic that does not require a display name. CVTs, like normal topics, have a GUID that can be referenced independently. The Freebase client, however, treats them very differently from topics. In most cases, every property of the CVT should be a disambiguation property.
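As a rough illustration of the population example, the sketch below models a city topic whose population property points at CVT-like nodes, each holding a number and a year. The names (DatedInteger, Topic, population_in) are hypothetical and are not part of any Freebase API, and the figures are made-up placeholders rather than real population data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class DatedInteger:
    """CVT-like node: no display name, just a bundle of fields."""
    number: int
    year: int


@dataclass
class Topic:
    """Ordinary topic: has a display name and may link to CVT nodes."""
    name: str
    population: List[DatedInteger] = field(default_factory=list)

    def population_in(self, year: int) -> Optional[int]:
        # A population query is always, at least implicitly, for some date.
        for entry in self.population:
            if entry.year == year:
                return entry.number
        return None


# Placeholder figures, for illustration only.
city = Topic("Example City")
city.population.append(DatedInteger(number=500_000, year=1997))
city.population.append(DatedInteger(number=550_000, year=2001))

print(city.population_in(1997))
print(city.population_in(1990))
```

Neither DatedInteger instance needs a name of its own; each is meaningful only through the values it bundles and the topic that points at it.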
You can tell whether a type is a CVT using MQL: it is one if it has
/freebase/type_hints/mediator == True
You can see this hint by looking at the schema in the Explore view: http://www.freebase.com/tools/explore/measurement_unit/dated_integer
A similar hint on '/freebase/property' in the schema determines which properties are shown in the standard CVT display:
/freebase/property_hints/disambiguator == True
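The mediator check can be written as an MQL read query. The sketch below only builds the query as JSON and interprets a result shaped like the query; it makes no network call. The helper names (mediator_query, is_cvt) and the sample result are assumptions for illustration.

```python
import json


def mediator_query(type_id: str) -> dict:
    """Build an MQL read query asking for a type's mediator hint.

    In MQL, a null value asks the service to fill in that property.
    """
    return {
        "id": type_id,
        "type": "/type/type",
        "/freebase/type_hints/mediator": None,
    }


def is_cvt(result: dict) -> bool:
    """Treat a type as a CVT only when the mediator hint is explicitly true."""
    return result.get("/freebase/type_hints/mediator") is True


query = mediator_query("/measurement_unit/dated_integer")
print(json.dumps(query, indent=2))

# Hypothetical result: the query echoed back with the hint filled in.
sample_result = {**query, "/freebase/type_hints/mediator": True}
print(is_cvt(sample_result))
```

A missing or null hint is treated as "not a CVT" here, which is a design choice of this sketch rather than documented behaviour.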
There are two special aspects relating to the way that compound value types are displayed. First, the properties of a compound value type are intended to be displayed on a single line, so that when a user clicks edit next to a property that uses a CVT as its expected type, they will see all the properties of that type displayed next to one another. If you edit the Performances property for a film, for example, which has the CVT of Film Performances as its expected type, you'll see the empty fields for Actor, Character and Special Performance Type appear on a single line for that property. One way to think of a CVT is that it's a method for providing multiple information fields for a single property when that property uses the CVT as its expected type.
Another special aspect of a CVT is that, while each of its properties can be a type that has its own list of topics (for example, the Actor property in Film Performances has an expected type of Film Actor, which has its own list of Film Actor topics), there are no topics associated with the CVT type itself. If you look at the Film Performance type, for example, you'll see topics listed for each of its properties, but no individual Film Performance topics.
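Because the CVT has no topics of its own, a query reaches its fields by passing through the property that uses it. The sketch below builds such an MQL query and flattens a result into one row per performance, mirroring the single-line display. It assumes the property ids /film/film/starring, actor and character; the film id, the helper name and the sample result are hypothetical placeholders.

```python
import json

# Hypothetical topic id, used only as a placeholder.
FILM_ID = "/en/example_film"

query = {
    "id": FILM_ID,
    "type": "/film/film",
    # Assumed id of the property whose expected type is the performance CVT.
    "starring": [{
        "actor": None,
        "character": None,
    }],
}


def performance_rows(result: dict) -> list:
    """Flatten each CVT node into an (actor, character) row."""
    return [(p["actor"], p["character"]) for p in result["starring"]]


# Made-up result shaped like the query, for illustration only.
sample_result = {
    "id": FILM_ID,
    "type": "/film/film",
    "starring": [
        {"actor": "Actor A", "character": "Character A"},
        {"actor": "Actor B", "character": "Character B"},
    ],
}

print(json.dumps(query, indent=2))
for row in performance_rows(sample_result):
    print(row)
```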
Creating a CVT is relatively simple.
1. Use either the Schema tab within a base or the My Types tab on your personal homepage to create a new type.
2. Click on the name of the new type in either location to open the Schema Editor.
3. Click edit next to display.
4. Select Compound value type, then click save.
5. Use Add New to create the properties for your CVT just as you would the properties for any other type.
6. For each property be sure to select Disambiguator under Set display preferences.
Note that you cannot use a CVT as the expected type for a property within another CVT. For example, if you are creating a CVT that should display money values, you would need separate properties for currency and amount, rather than using the existing Dated Money Value CVT.
CVT Merge Logic
When merging two topics that have CVTs, problems with duplication can occur. Suppose we had two "Syd Barrett" topics that were flagged for merge, and that both of these topics had a CVT representing that he was a guitarist in Pink Floyd:
/en/syd_barrett_1 --(member)--> /guid/9201..c453 --(group)--> /en/pink_floyd
                                       |
                                       +-----------(role)---> /en/guitarist

/en/syd_barrett_2 --(member)--> /guid/9201..1fa3 --(group)--> /en/pink_floyd
                                       |
                                       +-----------(role)---> /en/guitarist
When merging the two topics, the merge code would naively see only two separate /music/group_member/member properties pointing to two separate nodes. The result would look like duplicate information on the merged topic:
/en/syd_barrett_1 --(member)--> /guid/9201..c453 --(group)--> /en/pink_floyd
         |                             |
         |                             +-----------(role)---> /en/guitarist
         |
         +----------(member)--> /guid/9201..1fa3 --(group)--> /en/pink_floyd
                                       |
                                       +-----------(role)---> /en/guitarist
To solve this, CVT deduplication logic is used. The "identity" of a CVT is unimportant; a CVT is really identified by its links. If two CVTs have exactly the same set of links, they can be considered duplicates. The second CVT encodes no extra information and can be safely deleted:
/en/syd_barrett_1 --(member)--> /guid/9201..c453 --(group)--> /en/pink_floyd
                                       |
                                       +-----------(role)---> /en/guitarist
This principle can be extended. If one CVT contains only a subset of the links of another CVT, it too can be deleted, since it provides no extra information. So, if in the above example one of the CVTs specified only that Syd Barrett was a member of Pink Floyd (with no role), it could safely be deleted:
/en/syd_barrett_1 --(member)--> /guid/9201..c453 --(group)--> /en/pink_floyd
         |
         |
         +----------(member)--> /guid/9201..1fa3 --(group)--> /en/pink_floyd
                                       |
                                       +-----------(role)---> /en/guitarist

/en/syd_barrett_1
         |
         |
         +----------(member)--> /guid/9201..1fa3 --(group)--> /en/pink_floyd
                                       |
                                       +-----------(role)---> /en/guitarist
However, if two CVTs have overlapping links but neither set of links is a subset of the other, then deduplication cannot occur.
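The three cases above (identical links, subset, partial overlap) can be sketched with plain set comparisons. This is a minimal illustration, not the actual merge code: the function name dedupe_cvts is hypothetical, each CVT is reduced to a set of (property, target) links excluding the link back to the merged topic, and /en/vocalist is a made-up target used only to show the overlap case.

```python
from typing import Dict, FrozenSet, List, Tuple

Link = Tuple[str, str]  # (property, target id)


def dedupe_cvts(cvts: Dict[str, FrozenSet[Link]]) -> List[str]:
    """Return the ids of the CVTs to keep.

    A CVT is dropped when its links equal, or are a subset of,
    the links of a CVT that has already been kept.
    """
    kept: List[str] = []
    # Visit larger link sets first so subsets are checked against supersets.
    for cvt_id in sorted(cvts, key=lambda i: (-len(cvts[i]), i)):
        links = cvts[cvt_id]
        if any(links <= cvts[k] for k in kept):
            continue  # adds no information beyond a kept CVT
        kept.append(cvt_id)
    return kept


full = frozenset({("group", "/en/pink_floyd"), ("role", "/en/guitarist")})
group_only = frozenset({("group", "/en/pink_floyd")})
other_role = frozenset({("group", "/en/pink_floyd"), ("role", "/en/vocalist")})

# Identical links: one of the two is dropped.
exact = dedupe_cvts({"/guid/9201..c453": full, "/guid/9201..1fa3": full})

# Subset: the CVT without a role is dropped.
subset = dedupe_cvts({"/guid/9201..c453": group_only, "/guid/9201..1fa3": full})

# Overlapping, but neither is a subset: both are kept.
overlap = dedupe_cvts({"cvt_a": full, "cvt_b": other_role})

print(exact, subset, overlap)
```

Which of two identical CVTs survives is arbitrary in this sketch (ties are broken by id), which matches the point that a CVT's own identity carries no information.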
This logic is carried out in a Constant Gardener script that looks for duplicate CVTs on a nightly basis.