Tags: python, mediawiki, wikipedia, pywikibot

How to get the content of Wikipedia Talk pages in Python


I am currently using the following code to obtain the content of a Wikipedia page.

import pywikibot as pw

page = pw.Page(pw.Site('en'), 'Forensic science')
page.text

However, the above code does not seem to return the content of Wikipedia Talk pages, e.g.:

import pywikibot as pw
page = pw.Page(pw.Site('en'), 'Talk:Forensics science')
page.text

More precisely, I want to get the content of this page: https://en.wikipedia.org/w/index.php?title=Talk:Forensic_science&action=edit

I am happy to provide more details if needed. :)


Solution

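  • You have a typo in the talk page title: 'Talk:Forensics science' should be 'Talk:Forensic science' (without the s at the end of Forensic). With the misspelled title the page does not exist, and page.text returns an empty string for a nonexistent page instead of raising an error. With the correct title it should work as you expect.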

    If you want an explicit error when the page does not exist, use the Page.get method:

    import pywikibot as pw
    page = pw.Page(pw.Site('en', 'wikipedia'), 'Talk:Forensics science')
    text = page.get()
    

    This will raise:

    [...]
    "...site.py", line 4166, in loadrevisions
        raise NoPage(page)
    pywikibot.exceptions.NoPage: Page [[wikipedia:en:Talk:Forensics science]] doesn't exist.
    CRITICAL: Exiting due to uncaught exception <class 'pywikibot.exceptions.NoPage'>
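
If you would rather check for a missing page than handle an exception, something along these lines may help. This is only a sketch: the helper names read_page_text and fetch_talk_text are made up for illustration and are not part of pywikibot. It relies on Page.exists() and Page.toggleTalkPage(), and it assumes pywikibot is installed and configured with network access.

```python
def read_page_text(page):
    """Return the wikitext of `page`, or None if the page does not exist.

    `page` is any object offering pywikibot's `exists()` method and `text`
    attribute, so a small stand-in object is enough for offline testing.
    """
    if not page.exists():
        return None
    return page.text


def fetch_talk_text(article_title, lang="en", family="wikipedia"):
    """Fetch the Talk page text for an article (hypothetical helper, needs network)."""
    # Imported here so that read_page_text can be used without pywikibot installed.
    import pywikibot as pw

    article = pw.Page(pw.Site(lang, family), article_title)
    # toggleTalkPage() maps 'Forensic science' to 'Talk:Forensic science',
    # so the 'Talk:' prefix never has to be typed by hand.
    talk = article.toggleTalkPage()
    return read_page_text(talk)
```

Calling fetch_talk_text('Forensic science') should then return the wikitext of the talk page, while a misspelled title such as 'Forensics science' would give None, provided no talk page exists under that title.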