pywikibot

How to detect maxlag exception in pywikibot


I am developing a Wikipedia bot to analyze editing contributions. Unfortunately, it takes hours to complete a single run, and at some point during that run Wikipedia's database replication delay is sure to exceed 5 seconds (the default maxlag value). The recommendation in the API's maxlag parameter documentation is to detect the lag error, pause for X seconds, and retry.

But all I am doing is reading contributions with:

# site, username and max_per_user_contribs are defined earlier in the script
usrpg = pywikibot.Page(site, 'User:' + username)
usr = pywikibot.User(usrpg)
# each contrib is a (page, oldid, timestamp, comment) tuple
for contrib in usr.contributions(total=max_per_user_contribs):
    # (analyzes contrib here)

How can I detect the error and resume the iteration? This is the error:

WARNING: API error maxlag: Waiting for 10.64.32.21: 7.1454429626465 seconds lagged
Traceback (most recent call last):
  File ".../bot/core/pwb.py", line 256, in <module>
    if not main():
  File ".../bot/core/pwb.py", line 250, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File ".../bot/core/pwb.py", line 121, in run_python_file
    main_mod.__dict__)
  File "analyze_activity.py", line 230, in <module>
    attrs = usr.getprops()
  File ".../bot/core/pywikibot/page.py", line 2913, in getprops
    self._userprops = list(self.site.users([self.username, ]))[0]
  File ".../bot/core/pywikibot/data/api.py", line 2739, in __iter__
    self.data = self.request.submit()
  File ".../bot/core/pywikibot/data/api.py", line 2183, in submit
    raise APIError(**result['error'])
pywikibot.data.api.APIError: maxlag: Waiting for 10.64.32.21:
    7.1454 seconds lagged [help:See https://en.wikipedia.org/w/api.php for API usage]
<class 'pywikibot.data.api.APIError'>
CRITICAL: Closing network session.

It occurs to me that I could catch the exception thrown at this line:

 raise APIError(**result['error'])

But then restarting the iteration over the user's contributions seems terribly inefficient. Some users have 400,000 edits, so rerunning it from the beginning throws away a lot of work.
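
What I would really like is to remember the timestamp of the last contribution I processed and, after a pause, resume from there instead of starting over. Something like the sketch below is what I have in mind; note that the start= keyword and the pause length are my own assumptions (I am not sure my version of contributions() forwards start to the underlying usercontribs API, and the last item may be re-yielded on resume):

import time

import pywikibot
from pywikibot.data.api import APIError

def contributions_with_resume(usr, total):
    """Hypothetical wrapper: on a maxlag APIError, sleep and then resume
    from the timestamp of the last contribution already processed."""
    last_seen = None
    while True:
        try:
            # Assumption: start= is passed through to the usercontribs
            # API (ucstart), so iteration resumes at last_seen.
            for page, oldid, timestamp, comment in usr.contributions(
                    total=total, start=last_seen):
                last_seen = timestamp
                yield page, oldid, timestamp, comment
            return
        except APIError as err:
            if err.code != 'maxlag':
                raise
            time.sleep(30)  # pause, then retry from last_seen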

I have googled for examples of doing this (detecting the error and retrying) but I found nothing useful.


Solution

  • Converting the previous conversation in the comments into an answer.

    One possible way to resolve this is to wrap the call in a try/except, catch the error, and redo the piece of code that caused it.

    But pywikibot already does this internally for us! By default, pywikibot retries every failed API call twice if you're using the default user-config.py it generates. I found that increasing the following settings does the trick in my case (see the user-config.py sketch after the list):

    • maxlag = 20
    • retry_wait = 20
    • max_retries = 8
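
    For reference, this is roughly what the relevant lines in user-config.py look like; the values are just the ones that worked for me, so tune them to your workload:

    # user-config.py (excerpt)
    # tolerate up to 20 seconds of replication lag before the API rejects the request
    maxlag = 20
    # minimum number of seconds to wait before resubmitting a failed request
    retry_wait = 20
    # give up on a request only after 8 failed attempts
    max_retries = 8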

    According to the documentation of the maxlag parameter, maxlag is the setting recommended to increase, especially if you're doing a large number of writes in a short span of time. The retry_wait and max_retries settings, on the other hand, are useful when someone else is writing a lot (as in my case: my scripts only read from the wiki).
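
    If you prefer not to edit user-config.py, the same values can also be set from the script itself before any requests are made. This is only a sketch and I have not checked every pywikibot version, but as far as I know these config values are read at request time:

    import pywikibot
    from pywikibot import config

    # override the defaults early in the script (assumption: the values
    # are consulted per request, so setting them here is sufficient)
    config.maxlag = 20
    config.retry_wait = 20
    config.max_retries = 8

    site = pywikibot.Site('en', 'wikipedia')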