json, mediawiki-api

Retrieve all image URLs from MediaWiki API


I know how to get all images for some given titles using the MediaWiki API, and I've been using it like this (all images on the "Steve Jobs" and "Spider" articles):

http://en.wikipedia.org/w/api.php?action=query&prop=images&format=json&imlimit=200&titles=Steve_Jobs|Spider

However, the images are given in this format:

...
    images: [
    {
    ns: 6,
    title: "File:A garden spider in Chennai.JPG"
    },
    {
    ns: 6,
    title: "File:Agelenidae labyrinthica.JPG"
    },
    {
    ns: 6,
    title: "File:Ant Mimic Spider.jpg"
    }, ...

Which sucks, because I don't have much use for the titles; what I need are the image URLs. I know I can then query for the URLs of the images using the titles, like this:

http://en.wikipedia.org/w/api.php?action=query&prop=imageinfo&format=json&iiprop=url&iilimit=10&titles=File:A%20garden%20spider%20in%20Chennai.JPG|File:Agelenidae%20labyrinthica.JPG|...

but this is sort of expensive programmatically, because I have to store all the titles in an array, join them into one string separated by the | character, and then request that second URL. Is there a better way to do this?
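For reference, the two-step approach I'm describing looks roughly like the sketch below. The function names and the `API_URL` constant are my own placeholders, not part of the API, and it assumes the default `format=json` layout where `query.pages` is an object keyed by page ID.

```python
from urllib.parse import urlencode

# Assumed endpoint: same path as the URLs above, using https.
API_URL = "https://en.wikipedia.org/w/api.php"


def extract_image_titles(response):
    """Collect file titles from a JSON-decoded prop=images response."""
    titles = []
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        # Pages without images simply have no "images" key.
        for image in page.get("images", []):
            titles.append(image["title"])
    return titles


def build_imageinfo_url(titles):
    """Build the second request: prop=imageinfo for the collected titles."""
    params = {
        "action": "query",
        "prop": "imageinfo",
        "format": "json",
        "iiprop": "url",
        # Titles are joined with "|"; urlencode percent-encodes the result.
        "titles": "|".join(titles),
    }
    return API_URL + "?" + urlencode(params)
```

So every batch of articles costs two round trips plus the bookkeeping in between, which is the part I'd like to avoid.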

Using the sandbox might help you figure it out.

I want a way to query the MediaWiki API for the URLs of all the images, given an array of article titles (e.g. "Steve_Jobs|Spider|Cat").


Solution

  • You can use prop=images as a generator and feed its output to prop=imageinfo. Generator parameters take a g prefix, so imlimit becomes gimlimit. Something like:

    http://en.wikipedia.org/w/api.php?format=json&action=query&generator=images&gimlimit=max&prop=imageinfo&iiprop=url&titles=Steve_Jobs|Spider
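    A minimal Python sketch of using that query, assuming the default format=json layout where query.pages is keyed by page ID and each file page carries an imageinfo list. The function names, the API_URL constant, and the User-Agent string are placeholders of mine, not anything the API defines. Large result sets can be split across several responses; this sketch does not follow the continuation.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint: same path as the URL above, using https.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_generator_url(article_titles):
    """Build the single generator=images + prop=imageinfo request."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "images",
        "gimlimit": "max",
        "prop": "imageinfo",
        "iiprop": "url",
        "titles": "|".join(article_titles),
    }
    return API_URL + "?" + urlencode(params)


def extract_image_urls(response):
    """Pull every image URL out of a JSON-decoded response."""
    urls = []
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        # Skip entries that carry no imageinfo (e.g. missing files).
        for info in page.get("imageinfo", []):
            urls.append(info["url"])
    return urls


def fetch_image_urls(article_titles):
    """Perform the request over the network and return the image URLs."""
    request = Request(
        build_generator_url(article_titles),
        headers={"User-Agent": "image-url-example/0.1"},  # placeholder value
    )
    with urlopen(request) as reply:
        return extract_image_urls(json.load(reply))
```

    Calling fetch_image_urls(["Steve_Jobs", "Spider"]) would then return the URLs from one request, with no intermediate list of file titles to manage.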