
How can I tell whether a specific Index template parameter is empty using pywikibot?


I am trying to fill in the page count of a book on its Wikisource Index page. The following code writes the value into the page-number parameter ('|Number of pages='). When the parameter is empty, the result looks fine. But if I run the code a second time, the value is appended again and 67 becomes 6767. How can I check whether the '|Number of pages=' parameter is empty? Or, if the parameter is already filled, how can I make the code skip the page?

The code that writes the value:

#!/usr/bin/env python
# -*- coding: utf-8 -*- 
import pywikibot

indexTitle = 'அட்டவணை:தமிழ் நாடகத் தலைமை ஆசிரியர்-2.pdf'
indexPages = '67'
site1 = pywikibot.Site('ta', 'wikisource')
page = pywikibot.Page(site1, indexTitle)
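# Append the page count to the parameter and store the result on the page,
# otherwise page.save() has nothing new to write
page.text = page.text.replace('|Number of pages=', '|Number of pages=' + indexPages)
page.save(summary='67')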

Solution

  • You can use re, the regular expression library, to check whether the parameter already has a value:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import pywikibot
    import re
    
    indexTitle = 'அட்டவணை:தமிழ் நாடகத் தலைமை ஆசிரியர்-2.pdf'
    indexPages = '67'
    site1 = pywikibot.Site('ta', 'wikisource')
    page = pywikibot.Page(site1, indexTitle)
    print(page.text)
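    # Match the parameter only when digits already follow it
    # (raw string, so \| and \d are passed to the regex engine unchanged)
    res = re.search(r'\|Number of pages= *(\d+)', page.text)
    if res:
        print("Number of pages is already set to %s" % res.group(1))
    else:
        # The parameter is empty: fill it in and save the change
        page.text = page.text.replace('|Number of pages=', '|Number of pages=' + indexPages)
        page.save(summary='67')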
    

    Also, if you are processing UTF-8 text, it is better to move to Python 3, which has much better Unicode support than Python 2.
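
The check-then-fill logic above can also be wrapped in a small function that works on the wikitext string alone, which makes it easy to try out without touching the wiki. This is only a sketch: `fill_if_empty` is a hypothetical helper (not part of pywikibot), and it assumes each parameter sits on its own line and that the value contains no `|` or `}` characters.

```python
import re


def fill_if_empty(text, param, value):
    """Hypothetical helper: set `param` to `value` only if it is currently empty.

    Returns (new_text, changed). Assumes one parameter per line and
    values without '|' or '}' characters.
    """
    # Group 1: "|param=" (optional spaces around the name)
    # Group 2: whatever follows on the same line, up to '|', '}' or newline
    pattern = re.compile(r'(\|\s*' + re.escape(param) + r'\s*=)([^|\n}]*)')
    match = pattern.search(text)
    if match is None:
        return text, False  # parameter not present at all
    if match.group(2).strip():
        return text, False  # parameter already has a value: skip
    # Rebuild the text with the value placed right after "|param="
    new_text = text[:match.start()] + match.group(1) + value + text[match.end():]
    return new_text, True


# Made-up sample wikitext for illustration only
sample = "{{Index\n|Title=Example\n|Number of pages=\n|Year=1950\n}}"
first, changed_first = fill_if_empty(sample, 'Number of pages', '67')
second, changed_second = fill_if_empty(first, 'Number of pages', '67')
print(changed_first, changed_second)
```

With pywikibot, the result could be used by assigning the returned text to `page.text` and calling `page.save(...)` only when the second return value is true, so that a second run leaves an already filled page alone.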