python parsing wikipedia wikipedia-api

API for getting edits on Wikipedia


I want to get the text of a Wikipedia page before and after a given edit. I have this diff URL:

https://en.wikipedia.org/w/index.php?diff=328391582&oldid=328391343

However, I want the text in JSON format so that I can use it directly in my program. Does MediaWiki provide an API that returns the old and new text of an edit, or do I have to parse the HTML page myself?


Solution

  • Try the Revisions API: https://www.mediawiki.org/wiki/API:Revisions

    A few of its parameters may be of use:

    1. rvparse: Parse revision content. For performance reasons, rvlimit is forced to 1 when this option is used.

    2. rvdifftotext: Text to diff each revision to.

    If those do not work, there is still:

    1. rvprop=ids: Returns the revid and, from MediaWiki 1.16 onward, the parentid.

    Once you have the parent ID, you can fetch the text of both revisions and compare them yourself.
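The last approach (fetch both revisions, then compare them locally) can be sketched roughly as follows. This is a minimal sketch under some assumptions, not the only way to do it: the function names are made up for illustration, `rvslots=main` and `formatversion=2` are API parameters not mentioned above, the `User-Agent` string is a placeholder, and the local diff uses Python's `difflib` rather than anything provided by MediaWiki.

```python
import difflib
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(rev_ids):
    """Build an API:Revisions URL requesting ids and wikitext for the given revisions."""
    params = {
        "action": "query",
        "prop": "revisions",
        "revids": "|".join(str(r) for r in rev_ids),
        "rvprop": "ids|content",   # "ids" yields revid and parentid
        "rvslots": "main",         # assumption: main content slot only
        "format": "json",
        "formatversion": "2",      # assumption: "pages" comes back as a list
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def texts_by_revid(response):
    """Map each revid in a formatversion=2 response to (parentid, wikitext)."""
    result = {}
    for page in response["query"]["pages"]:
        for rev in page.get("revisions", []):
            # Content can be missing, e.g. for hidden revisions; None in that case.
            text = rev.get("slots", {}).get("main", {}).get("content")
            result[rev["revid"]] = (rev.get("parentid"), text)
    return result


def unified_diff(old_text, new_text):
    """Return a unified diff of two wikitext strings, computed locally."""
    lines = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="old", tofile="new", lineterm="",
    )
    return "\n".join(lines)


def fetch(url):
    """Perform the HTTP request and decode the JSON body."""
    # Placeholder User-Agent; replace it with one that identifies your client.
    req = urllib.request.Request(url, headers={"User-Agent": "diff-example/0.1"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the IDs from the URL in the question, usage would look something like `revs = texts_by_revid(fetch(build_query_url([328391343, 328391582])))` followed by `print(unified_diff(revs[328391343][1], revs[328391582][1]))`. If you only know the newer revision, request it alone first and read its parentid from the result.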