metadata, sparql, wikidata, wikimedia, wikidata-api

Get Wikidata item view count / popularity index


Here's my SPARQL query to list mathematicians with their Wikipedia links and images:

SELECT DISTINCT ?pers ?persLabel ?nameLabel ?persDescription ?link ?img
WHERE {
  ?pers wdt:P31 wd:Q5.
  {?pers wdt:P101* wd:Q395} union {?pers wdt:P106* wd:Q170790}.
  ?pers wdt:P734 ?name.
  optional {?link schema:about ?pers; schema:isPartOf <https://en.wikipedia.org/>. }
  optional {?pers wdt:P18 ?img. }
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
limit 100

What I want next is to filter the list down to the most famous / popular entries using some kind of popularity index. In a Wikimedia Action API query, I would have used the pageviews property (prop=pageviews) to get average view counts over some period (say 60 days), and thus arrive at an estimate of how popular each article is with users, e.g.

https://www.mediawiki.org/w/api.php?action=query&generator=allpages&gaplimit=max&gapfilterredir=nonredirects&gapfrom=a&prop=pageviews

Yet I don't know whether such a metric exists for Wikidata as well, or whether some other index could be used for this purpose.


Solution

  • Pageview numbers are recorded, as clicking Page information in the navigation bar on the left of any item view will show. The numbers are also available in this tool and, I would bet, through the API. Indeed, just changing the hostname in your example URL to Wikidata's works.

    That doesn't quite help if you want or need the data within the query interface, however. For that, I suggest using a different proxy for "popularity". A common one is the number of language versions that have an article about the subject, or "sitelinks". Here's how that would work:

    [... your query as before ...]
        SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    
        ?pers wikibase:sitelinks ?sitelinks.
    }  order by desc(?sitelinks)
    

    Alternatively, you might try the number of publications:

        ?publication wdt:P50 ?pers.
    } GROUP BY ?pers ORDER BY desc(COUNT(?publication))
    

    ...but I'm afraid Wikidata isn't complete enough for that to be reliable, especially since many scientific papers aren't properly linked with their authors.
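
For the API route mentioned at the start of this answer, here is a minimal Python sketch of how the view counts could be fetched and averaged outside the query interface. It assumes that the `prop=pageviews` module is enabled on www.wikidata.org, that it accepts a `pvipdays` parameter, and that each page in the JSON response carries a `pageviews` object mapping dates to counts (with null for days without data). The function names `pageviews_url` and `mean_daily_views` are made up for illustration, and the network call itself is left out.

```python
from urllib.parse import urlencode

# Assumption: Wikidata's Action API endpoint, i.e. the example URL with the hostname swapped.
API_URL = "https://www.wikidata.org/w/api.php"


def pageviews_url(item_ids, days=60):
    """Build an Action API URL asking for daily pageviews of the given items.

    item_ids are item page titles such as "Q7251" (placeholder values only).
    """
    params = {
        "action": "query",
        "format": "json",
        "titles": "|".join(item_ids),  # several titles are separated by "|"
        "prop": "pageviews",
        "pvipdays": days,  # assumed parameter: number of days to report
    }
    return API_URL + "?" + urlencode(params)


def mean_daily_views(response):
    """Map each page title in a decoded JSON response to its mean daily views.

    Days reported as null (None) are skipped; pages without data get 0.0.
    """
    result = {}
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        daily = page.get("pageviews") or {}
        counts = [v for v in daily.values() if v is not None]
        result[page["title"]] = sum(counts) / len(counts) if counts else 0.0
    return result
```

You would fetch `pageviews_url([...])` with whatever HTTP client you prefer, decode the JSON, and pass it to `mean_daily_views`. For long lists of items the API will probably require batching and continuation, which this sketch does not handle. Note also that these are views of the Wikidata item pages, not of the Wikipedia articles, so they may say little about how well known the person is.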