sparql, wikidata, wikidata-query-service

SPARQL, Wikidata: how to show only statements without any references


I am trying to gather all statements about a specific item (say, Václav Havel, Q36233), but I need to exclude every statement that has a reference.

SELECT ?subjectLabel ?property ?object ?objectLabel
WHERE {
  ?subject ?property ?object .
  FILTER (?subject = wd:Q36233)
  FILTER (REGEX(STR(?property), "http://www.wikidata.org/prop/.*"))
  FILTER (REGEX(STR(?object), "http://www.wikidata.org/entity/Q.*"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

This query returns 194 statements, but some of them have references.

I need the result list to include this: 0 references = show

But not include this: 1+ references = not show

I tried changing the query in many ways with the MINUS and FILTER operators, but none of them worked, and most returned 0 results.


Solution

  • You can use this query:

    SELECT DISTINCT ?subjectLabel ?property ?object ?objectLabel
    WHERE {
      ?subject ?property ?object .
      ?subject ?p ?stmt.
      ?stmt ?ps ?object .
      
      BIND (URI(REPLACE(STR(?p),STR(p:),STR(wdt:))) as ?property)
      BIND (URI(REPLACE(STR(?p),STR(p:),STR(ps:))) as ?ps)
      
      FILTER (?subject = wd:Q36233)
      FILTER (REGEX(STR(?p), STR(p:)))
      FILTER (REGEX(STR(?object), STR(wd:)))
      FILTER NOT EXISTS { ?stmt prov:wasDerivedFrom ?ref . }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

    Here, FILTER NOT EXISTS { ?stmt prov:wasDerivedFrom ?ref . } checks that the statement node ?stmt has no references.
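
    To see which statements the filter keeps and which it drops, it can help to count the references on each statement node first. The query below is only a sketch under assumptions: it relies on the prefixes predefined by the Wikidata Query Service, the variable names ?rank and ?refCount are made up for illustration, and the wikibase:rank triple is used only to restrict ?stmt to statement nodes.

    SELECT ?p ?stmt (COUNT(?ref) AS ?refCount)
    WHERE {
      wd:Q36233 ?p ?stmt .
      # Only statement nodes carry a rank, so this excludes direct (wdt:) values.
      ?stmt wikibase:rank ?rank .
      # OPTIONAL keeps statements with no reference; COUNT(?ref) is then 0.
      OPTIONAL { ?stmt prov:wasDerivedFrom ?ref . }
    }
    GROUP BY ?p ?stmt
    ORDER BY ?refCount

    Rows with ?refCount = 0 are the statements that FILTER NOT EXISTS lets through.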

    Note that the pair ?subject ?p ?stmt . ?stmt ?ps ?object . is almost equivalent to ?subject ?property ?object ., with one difference: the wdt: prefix matches only best-rank ("truthy") statements, so deprecated statements, for example, are excluded, while the p: prefix matches statements of every rank. For that reason ?subject ?property ?object . is not redundant, and in general you cannot safely remove it.
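
    If rewriting IRIs with REPLACE feels fragile, one possible alternative is to look up the p: and ps: predicates from the property entity and to ask for the best rank explicitly. This is a sketch, not a tested replacement: it assumes the wikibase:claim, wikibase:statementProperty and wikibase:BestRank terms of the Wikidata RDF model, the variable ?prop is introduced here for illustration, and its results may differ slightly from the query above.

    SELECT DISTINCT ?propLabel ?object ?objectLabel
    WHERE {
      wd:Q36233 ?p ?stmt .
      # Map the property entity to its p: and ps: predicates.
      ?prop wikibase:claim ?p ;
            wikibase:statementProperty ?ps .
      # BestRank marks the statements that wdt: would expose.
      ?stmt ?ps ?object ;
            a wikibase:BestRank .
      # Keep only values that are Wikidata entities.
      FILTER (STRSTARTS(STR(?object), STR(wd:)))
      FILTER NOT EXISTS { ?stmt prov:wasDerivedFrom ?ref . }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }

    Because the rank is checked on the statement node itself, this version does not need the extra ?subject ?property ?object . join.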

    A more efficient version uses Blazegraph query hints so that the subquery is evaluated once and before the rest of the pattern:

    SELECT DISTINCT ?subjectLabel ?property ?object ?objectLabel
    WHERE {
      ?subject ?property ?object .
      {
        SELECT ?subject ?p ?stmt ?object (URI(REPLACE(STR(?p),STR(p:),STR(wdt:))) as ?property)
        WHERE {
          ?subject ?p ?stmt.
          ?stmt ?ps ?object .
          BIND (URI(REPLACE(STR(?p),STR(p:),STR(ps:))) as ?ps)
          FILTER (?subject = wd:Q36233)
          FILTER (REGEX(STR(?p), STR(p:)))
          FILTER (REGEX(STR(?object), STR(wd:)))
          hint:SubQuery hint:runOnce true .
        }
      }
      hint:Prior hint:runFirst true .
      FILTER NOT EXISTS { ?stmt prov:wasDerivedFrom ?ref . }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }