
How to know if a Wiki page is for a person


I search for a term on Wikipedia using the MediaWiki API. I need to know whether that term is the name of a person.

For example, searching for "Leonardo Dicaprio":

https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=Leonardo%20Dicaprio&utf8=

I need to be able to tell from the query result whether this is the name of a person.


Solution

  • You would probably be better off doing this via the Wikidata Query Service and SPARQL.

    Something like this might work:

    SELECT DISTINCT ?person ?personLabel ?article WHERE {
      # instance of (P31) human (Q5)
      ?person wdt:P31 wd:Q5 .
      # English label containing the search term (case-insensitive)
      ?person rdfs:label ?personLabel .
      FILTER( LANG(?personLabel) = "en" )
      FILTER( CONTAINS(LCASE(?personLabel), "leonardo dicaprio") ) .
      # the matching English Wikipedia article
      ?article schema:about ?person .
      ?article schema:isPartOf <https://en.wikipedia.org/> .
    }
    LIMIT 10
    

    (If that times out, you could narrow the search with an extra constraint, e.g. 'country of citizenship' (P27) set to the United States (Q30): ?person wdt:P27 wd:Q30 .)
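
If you want to run that query from a script, something like the following Python sketch might do. It sends the same SPARQL to the public Wikidata Query Service endpoint and flattens the JSON result. The helper names (build_query, extract_people, find_people), the User-Agent string, and the timeout are my own placeholders, not part of any official client, and the escaping only handles backslashes and double quotes, so treat it as a starting point rather than a finished implementation.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Same query as above; %s is replaced by a quoted SPARQL string literal.
QUERY_TEMPLATE = """
SELECT DISTINCT ?person ?personLabel ?article WHERE {
  ?person wdt:P31 wd:Q5 .
  ?person rdfs:label ?personLabel .
  FILTER( LANG(?personLabel) = "en" )
  FILTER( CONTAINS(LCASE(?personLabel), %s) ) .
  ?article schema:about ?person .
  ?article schema:isPartOf <https://en.wikipedia.org/> .
}
LIMIT 10
"""


def build_query(name):
    """Return the SPARQL text for a lowercase substring match on `name`."""
    # Minimal escaping (assumption: names contain no control characters).
    escaped = name.lower().replace("\\", "\\\\").replace('"', '\\"')
    return QUERY_TEMPLATE % ('"' + escaped + '"')


def extract_people(response):
    """Flatten a SPARQL JSON result into a list of plain dicts."""
    people = []
    for binding in response["results"]["bindings"]:
        people.append({
            "person": binding["person"]["value"],
            "label": binding["personLabel"]["value"],
            "article": binding["article"]["value"],
        })
    return people


def find_people(name, timeout=60):
    """Query the endpoint; an empty list means no matching human was found."""
    params = urllib.parse.urlencode({"query": build_query(name), "format": "json"})
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        # Hypothetical User-Agent; replace it with one identifying your tool.
        headers={"User-Agent": "person-check-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_people(json.load(resp))
```

Calling find_people("Leonardo Dicaprio") would then return a list of matches, each with the Wikidata entity URI, its English label, and the English Wikipedia article URL. A non-empty list suggests the term is a person's name. The request can still time out for the same reason the raw query can, so the extra constraint mentioned above may be needed here as well.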