I would like to find out how many labels Wikidata has for each language, out of its total of roughly 50 million entries.
For example, at https://query.wikidata.org, for the Catalan language ("ca"), I tried:
SELECT ?lang (COUNT(DISTINCT ?item) AS ?count) WHERE {
?item schema:inLanguage "ca" .
} GROUP BY ?lang
ORDER BY DESC (?count)
and got a result of 703351, but I don't think that is correct: I downloaded the Wikidata dump (from https://dumps.wikimedia.org/wikidatawiki/entities/) and have already extracted more than two million Catalan labels, and the extraction process is still running.
So, any clue what I am doing wrong?
As suggested in the notes above, using Quarry:
https://quarry.wmflabs.org/query/27976
USE wikidatawiki_p;
DESCRIBE wb_terms;
SELECT COUNT(*) FROM wb_terms
WHERE term_type = 'label' AND term_language = 'ca';
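To cross-check the number against the dump mentioned in the question, here is a minimal Python sketch that counts labels per language. It assumes the JSON dump layout (one entity per line inside a single JSON array, each entity carrying a "labels" object keyed by language code). The function name and the sample lines are made up for illustration and are not real dump data.

```python
import json
from collections import Counter


def count_labels_per_language(lines):
    """Count labels per language code in Wikidata JSON dump lines."""
    counts = Counter()
    for line in lines:
        # The dump is one big JSON array: "[" on the first line, "]" on the
        # last, and one entity per line in between, each followed by a comma.
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        entity = json.loads(line)
        # "labels" maps a language code to {"language": ..., "value": ...},
        # so counting the keys counts one label per language per entity.
        counts.update(entity.get("labels", {}).keys())
    return counts


# Tiny hand-made sample in the dump's line layout (hypothetical data).
sample = [
    "[",
    '{"id": "Q1", "labels": {"ca": {"language": "ca", "value": "univers"},'
    ' "en": {"language": "en", "value": "universe"}}},',
    '{"id": "Q2", "labels": {"ca": {"language": "ca", "value": "Terra"}}}',
    "]",
]
counts = count_labels_per_language(sample)
print(counts["ca"], counts["en"])
```

For the real dump, the lines could be streamed from the compressed file (for example with bz2.open in text mode) instead of a list, so the whole file never has to fit in memory. The result may not match the Quarry count exactly, since the dump and the database replica are snapshots taken at different times.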