Search code examples
sparqlwikidata

How to Query Wikidata for US persidents?


How to Query Wikidata for US presidents, who started their presidency after 1970 AND have at least one daughter?

I know how to query for US presidents but I don't know how to filter my search.

SELECT ?presLabel ?spouseLabel ?picp ?picps ?daughterLabel ?numchild ?start WHERE {

   ?pres wdt:P31 wd:Q5 .
   ?pres p:P39 ?position_held_statement .
   ?position_held_statement ps:P39 wd:Q11696 .
   ?position_held_statement pq:P580 ?start .


   ?pres wdt:P26 ?spouse .
   ?pres wdt:P40 ?childern.
   ?pres wdt:P40 ?daughter.
   ?daughter wdt:P21 wd:Q6581072.
   ?pres wdt:P1971 ?numchild.

   OPTIONAL {
        ?pres wdt:P18 ?picp
    } 

   OPTIONAL {
        ?spouse wdt:P18 ?picps
    } 

   SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
   }

 } ORDER BY ?start

Solution

  • Not that difficult:

    1. You already have the start time which is of type xsd:date. All you have to do now is to extract the year and use this value for filtering:

      FILTER(year(?start) > 1970)

    2. Your second condition "at least one daughter" is implicitly satisfied by the join, i.e. the presidents only match if there is such data. It would become more "interesting" if you're asking for "at least two daughters" because then you would have to group by each president, count the daughters and finally filter the groups:

      GROUP BY ?p 
      HAVING (count(?child) > 2)
      

    Your final query:

    SELECT ?presLabel ?spouseLabel ?picp ?picps ?daughterLabel ?numchild ?start WHERE {
    
       ?pres wdt:P31 wd:Q5 .
       ?pres p:P39 ?position_held_statement .
       ?position_held_statement ps:P39 wd:Q11696 .
       ?position_held_statement pq:P580 ?start .
    
       # the spouse
       ?pres wdt:P26 ?spouse .
       # all daughters
       ?pres wdt:P40 ?daughter.
       ?daughter wdt:P21 wd:Q6581072.
       # number of children
       ?pres wdt:P1971 ?numchild.
    
       OPTIONAL {
            ?pres wdt:P18 ?picp
        } 
    
       OPTIONAL {
            ?spouse wdt:P18 ?picps
        } 
    
      # filter by start date here
      FILTER(year(?start) > 1970)
    
       SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en" .
       }
    
     } ORDER BY ?start
    

    Additional Task

    The number of daughters can be retrieved in a subquery which simply has to be put inside the original query + adding the variable ?numDaughters to the selected variables:

    # here we compute the number of daughters 
    { 
     select ?pres (count(distinct ?daughter) as ?numDaughters) where {
            ?pres wdt:P31 wd:Q5 .
            ?pres p:P39 ?position_held_statement .
            ?position_held_statement ps:P39 wd:Q11696 .
            # all daughters
            ?pres wdt:P40 ?daughter.
            ?daughter wdt:P21 wd:Q6581072.
     } 
     group by ?pres
    }