I'm trying to extract the link to the NYTimes Topic page (one of the topic_equivalent_webpage values) for Barack Obama from Freebase, but my query doesn't return any results, even though the link is shown on the topic page (http://www.freebase.com/m/02mjmr). This is my query:
[{
  "id": "/en/barack_obama",
  "type": "/common/topic",
  "topic_equivalent_webpage": {
    "value": null,
    "value~=": "*nytimes*"
  }
}]
I've also tried extracting all of the topic_equivalent_webpage values instead, using this query:
[{
  "id": "/en/barack_obama",
  "type": "/common/topic",
  "topic_equivalent_webpage": []
}]
For some reason it only returns one of the values (http://www.worldcat.org/wcidentities/lccn-n94-112934).
Does anyone have any tips?
NOTE: All Freebase APIs are going away in a few months.
You have three choices:
1. Download the RDF dump and filter it. This is the most appropriate option for large-scale extraction, where the API is a poor fit. For the property name and decoding process, see #3.
2. Use the Topic API, e.g. https://www.googleapis.com/freebase/v1/topic/en/barack_obama?filter=/common/topic/topic_equivalent_webpage
3. Query MQL for the keys in the namespace that you want (i.e. the NY Times namespace):
[{
  "id": "/en/barack_obama",
  "key": [{
    "namespace": "/source/nytimes",
    "value": null
  }]
}]
Normally the result is an identifier that gets substituted into a URI template, but in the NYT case it's basically a full URI path that just gets appended to http://nytimes.com/.
The key value (e.g. top$002Freference$002Ftimestopics$002Fpeople$002Fo$002Fbarack_obama) will be MQL key encoded, so it needs to be decoded. In this case you can probably cheat and replace every "$002F" substring with "/". If any other characters are encoded, replace each $dddd with the character that has that Unicode code point, where dddd is the code point in hexadecimal (002F is "/").
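If it helps, here is a minimal Python sketch of that decoding step. It assumes every escape is a "$" followed by exactly four hex digits, as in the example key above; the function names decode_mql_key and nytimes_url are made up for illustration and are not part of any Freebase client library.

```python
import re

# Assumption: an MQL key escape is "$" followed by exactly four hex digits,
# e.g. "$002F" for "/". Anything else is left untouched.
_ESCAPE = re.compile(r"\$([0-9A-Fa-f]{4})")


def decode_mql_key(key: str) -> str:
    """Replace each $XXXX escape with the character at that code point."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), key)


def nytimes_url(key: str, base: str = "http://nytimes.com/") -> str:
    """Append the decoded key to the NYT base URL (hypothetical helper)."""
    return base + decode_mql_key(key)


key = "top$002Freference$002Ftimestopics$002Fpeople$002Fo$002Fbarack_obama"
print(nytimes_url(key))
# prints http://nytimes.com/top/reference/timestopics/people/o/barack_obama
```

The same decode_mql_key function should work for keys pulled out of the RDF dump (option 1), since the escaping is a property of the key itself rather than of the API that returned it.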