prop=extracts not returning all extracts in the MediaWiki API


I would like to use the Wikipedia API to return the extracts from multiple Wikipedia articles at once. I am trying, for example, the following request (I just chose the pageids randomly):

http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=3258248|11524059&prop=extracts&exsentences=1

But the response only contains the extract for the first pageid, not the second. Other properties do not seem to have this limitation. For example,

http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=3258248|11524059&prop=categories

will return the categories for both pageids. Is this a bug, or am I missing something?


Solution

  • Notice the <query-continue> element in the response. It tells you that to get the remaining extracts, you need to repeat the request with excontinue=1:

    http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=3258248|11524059&prop=extracts&exsentences=1&excontinue=1

    You should be able to get both of them in one request by specifying exlimit=max:

    http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=3258248|11524059&prop=extracts&exsentences=1&exlimit=max

    But this does not seem to work correctly, and I'm not sure why.

    BTW, categories have similar limitations, which is why your categories query has <query-continue> too and why it doesn't list all categories of the articles.
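
The continuation step described above can be sketched as a loop that repeats the request until the response no longer asks for more. This is a minimal sketch, not a definitive client. It assumes format=json instead of format=xml, and it accepts either the nested query-continue element discussed above or the flat continue object that newer responses use. The names continuation_params, fetch_json, and fetch_all_extracts are hypothetical, and the User-Agent string is a placeholder.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def continuation_params(response):
    """Return the parameters to add to the next request, or {} when done."""
    # Assumption: newer responses carry a flat "continue" object.
    if "continue" in response:
        return dict(response["continue"])
    # Older responses nest the values under "query-continue", keyed by module,
    # e.g. {"query-continue": {"extracts": {"excontinue": 1}}}.
    merged = {}
    for module_params in response.get("query-continue", {}).values():
        merged.update(module_params)
    return merged


def fetch_json(params):
    """Hypothetical helper: send one GET request and decode the JSON body."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "extract-example/0.1 (placeholder)"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def fetch_all_extracts(page_ids, fetch=fetch_json):
    """Collect one-sentence extracts for all page_ids, following continuation."""
    base = {
        "format": "json",
        "action": "query",
        "pageids": "|".join(str(p) for p in page_ids),
        "prop": "extracts",
        "exsentences": 1,
    }
    extracts = {}
    extra = {}
    while True:
        response = fetch({**base, **extra})
        pages = response.get("query", {}).get("pages", {})
        for page_id, page in pages.items():
            # Pages whose extract was not part of this batch have no "extract" key.
            if "extract" in page:
                extracts[page_id] = page["extract"]
        extra = continuation_params(response)
        if not extra:
            return extracts
```

Calling fetch_all_extracts([3258248, 11524059]) would issue as many requests as the API asks for and return a dict keyed by pageid (as strings). The fetch argument exists so the loop can be exercised with a stand-in function instead of live requests.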