sparql wikidata-query-service

Wikidata SPARQL - Duplicate results for spouse start time and end time


I am trying to construct a query that returns a list of actors and their spouses, including the marriage and divorce dates for each couple. I would expect each actor to appear once per relationship. However, when I include the start time and end time properties in the query, I get duplicate results. I suspect this is because the spouses' names and the dates are stored under different Wikidata prefixes and I'm not grouping them correctly.

Here is a sample query:

SELECT ?person ?personLabel ?spouse ?spouseLabel ?starttime ?endtime
WHERE
{
  ?person wdt:P106 wd:Q33999, wd:Q2526255, wd:Q28389, wd:Q3282637;
          wdt:P26 ?spouse.
  ?person p:P26 [pq:P580 ?starttime; pq:P582 ?endtime].
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ASC(UCASE(str(?personLabel)))
LIMIT 10

Here is a link to the SPARQL interactive service so you can see the duplicated results I'm referring to: https://query.wikidata.org/#SELECT%20%3Fperson%20%3FpersonLabel%20%3Fspouse%20%3FspouseLabel%20%3Fstarttime%20%3Fendtime%0AWHERE%0A%7B%0A%20%20%3Fperson%20wdt%3AP106%20wd%3AQ33999%2C%20wd%3AQ2526255%2C%20wd%3AQ28389%2C%20wd%3AQ3282637%3B%0A%20%20%20%20%20%20%20%20%20%20wdt%3AP26%20%3Fspouse.%0A%20%20%3Fperson%20p%3AP26%20%5Bpq%3AP580%20%3Fstarttime%3B%20pq%3AP582%20%3Fendtime%5D.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22.%20%7D%0A%7D%0AORDER%20BY%20ASC%28UCASE%28str%28%3FpersonLabel%29%29%29%0ALIMIT%2010%0A

[Screenshot of the duplicated results]


Solution

  • The problem with your query is that there is no link between the spouse and the statement about their marriage.

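    So for every actor, you are returning all their spouses and also all the start/end dates of their marriages, regardless of whether those dates belong to that specific spouse. An actor with two marriages, each with a start and an end date, therefore yields four rows instead of two.

    What you need to do is bind the spouse through the ps: namespace, inside the same statement that carries the dates, like so: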

    SELECT ?person ?personLabel ?spouse ?spouseLabel ?starttime ?endtime
    WHERE
    {
      ?person wdt:P106 wd:Q33999, wd:Q2526255, wd:Q28389, wd:Q3282637 .
      ?person p:P26 [ ps:P26 ?spouse ;  # This is the necessary change.
                      pq:P580 ?starttime ;
                      pq:P582 ?endtime ] .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    ORDER BY ASC(?personLabel)
    LIMIT 10
    

    In general, the wdt: namespace links an entity directly to a value, the p: namespace links an entity to a statement node, ps: links that statement to its main value, and pq: links the statement to a qualifier value, such as a start or end time.
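
    To make the role of each prefix visible, the same query can be written with the statement node bound to a named variable instead of a blank node. This is only a sketch and has not been run against the live service: the variable name ?marriage is my own choice, and wrapping the qualifiers in OPTIONAL is an assumption about what you want, namely to keep marriages that have no recorded start or end time (for example, ones that are still ongoing).

    ```sparql
    SELECT ?person ?personLabel ?spouse ?spouseLabel ?starttime ?endtime
    WHERE
    {
      # wdt: links the person directly to each occupation value.
      ?person wdt:P106 wd:Q33999, wd:Q2526255, wd:Q28389, wd:Q3282637 .
      # p: links the person to a marriage statement node.
      ?person p:P26 ?marriage .
      # ps: links that same statement to its main value, the spouse.
      ?marriage ps:P26 ?spouse .
      # pq: reads qualifiers of that same statement. OPTIONAL (an assumption
      # here) keeps marriages that lack a start or an end time.
      OPTIONAL { ?marriage pq:P580 ?starttime . }
      OPTIONAL { ?marriage pq:P582 ?endtime . }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    ORDER BY ASC(?personLabel)
    LIMIT 10
    ```

    Because ?spouse, ?starttime and ?endtime all hang off the same ?marriage node, each row describes exactly one marriage. If you only want marriages with both dates recorded, drop the two OPTIONAL wrappers, which gives the same pattern as the query above.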