Collect triples from Wikidata using SPARQL

I would like to collect triples (subject, predicate, object) for a Wikidata entity (e.g. Helen Keller). Currently, I have the query below:

SELECT ?wd ?wdLabel ?ps_Label ?ps_Description WHERE {
  VALUES (?item) {(wd:Q38203)}
  ?item ?p ?statement .
  ?statement ?ps ?ps_ .
  ?wd wikibase:claim ?p.
  ?wd wikibase:statementProperty ?ps.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
  }
}

I have two questions about this:

  1. Many uninformative triples are also returned, such as identifier properties ("CONOR.BG ID", "National Library of Ireland ID"). I only want the important triples. Can I filter these out directly, or is there a score I can rank by?

  2. The query above only retrieves triples in which the entity is the subject. In other words, triples of the form (Helen Keller, ?, ?) are retrieved, but not (?, ?, Helen Keller). I want to collect all triples, including those in which the entity is the object.


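    1. If you just want to exclude identifiers, you can add the constraint:
         FILTER NOT EXISTS { ?wd wdt:P31/wdt:P279* wd:Q6545185 . }

    This excludes every property that is an instance of unique identifier (Q6545185) or of one of its subclasses. Written as a bare triple pattern, without FILTER NOT EXISTS, the same line would do the opposite and keep only those properties.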

    2. I would suggest simply running a second query with ?item and ?ps_ swapped. If you want to do this in a single query, you can replace
       ?item ?p ?statement .
       ?statement ?ps ?ps_ .

    with

       { ?item ?p ?statement .
         ?statement ?ps ?ps_ . }
       UNION
       { ?subj ?p ?statement .
         ?statement ?ps ?item . }

    and also select ?subj to mark the rows that refer to the "inverted" relation.

    The resulting query would be as follows:

    SELECT ?subj ?subjLabel ?subjDescription ?wd ?wdLabel ?ps_Label ?ps_Description WHERE {
      VALUES (?item) {(wd:Q38203)}
      { ?item ?p ?statement .
        ?statement ?ps ?ps_ . }
      UNION
      { ?subj ?p ?statement .
        ?statement ?ps ?item . }
      ?wd wikibase:claim ?p.
      ?wd wikibase:statementProperty ?ps.
      # Drop properties that are (subclasses of) unique identifier
      FILTER NOT EXISTS { ?wd wdt:P31/wdt:P279* wd:Q6545185 . }
      SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en".
      }
    }
    ORDER BY ?subj ?wd
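
If you then want actual (subject, predicate, object) tuples on the client side, the result rows can be reshaped along the following lines. This is only a minimal sketch: the function names `run_query` and `bindings_to_triples` and the User-Agent string are made up for illustration, and it assumes the query is sent to the public endpoint at https://query.wikidata.org/sparql and selects ?subjLabel, ?wdLabel and ?ps_Label as above. It also assumes that ?subjLabel is only bound for rows where the item is the object.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"


def run_query(query, user_agent="triple-collector-example/0.1"):
    """Send a SPARQL query and return the list of JSON result bindings.

    The User-Agent value is a placeholder; replace it with your own.
    """
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


def bindings_to_triples(bindings, item_label):
    """Turn result rows into (subject, predicate, object) label tuples.

    Assumption: a row with ?subjLabel bound comes from the "inverted"
    branch, so the item is the object; otherwise the item is the subject.
    """
    triples = []
    for row in bindings:
        predicate = row["wdLabel"]["value"]
        if "subjLabel" in row:
            triples.append((row["subjLabel"]["value"], predicate, item_label))
        else:
            triples.append((item_label, predicate, row["ps_Label"]["value"]))
    return triples
```

With the query text stored in a string, `bindings_to_triples(run_query(query), "Helen Keller")` would give one tuple per result row. Keeping the network call separate from the reshaping step makes it easy to check the latter on a few hand-written rows.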