I am currently using the following code to obtain the content of a Wikipedia page:
import pywikibot as pw
page = pw.Page(pw.Site('en'), 'Forensic science')
page.text
However, the above code does not seem to return the content of Wikipedia Talk pages, e.g.:
import pywikibot as pw
page = pw.Page(pw.Site('en'), 'Talk:Forensics science')
page.text
More precisely, I want to get the content of this page: https://en.wikipedia.org/w/index.php?title=Talk:Forensic_science&action=edit
I am happy to provide more details if needed. :)
You have a typo in the talk page title: 'Talk:Forensics science' should have been 'Talk:Forensic science' (without the s at the end of Forensic). Other than that, it should work as you expect.
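One way to avoid retyping the talk title by hand is to derive the talk page from the article page. The following is only a minimal sketch: it assumes pywikibot's Page.toggleTalkPage(), which returns the other member of the article/talk pair (or None for special pages). The helper name talk_text is hypothetical, not part of pywikibot.

```python
def talk_text(article_page):
    """Return the wikitext of the talk page paired with `article_page`.

    `article_page` is assumed to behave like a pywikibot Page, i.e. it
    offers toggleTalkPage(). Returns None when there is no talk
    counterpart (for example, for special pages).
    """
    talk_page = article_page.toggleTalkPage()
    if talk_page is None:
        return None
    # The returned page need not exist on the wiki; in that case .text
    # is expected to be empty rather than raising.
    return talk_page.text
```

With pywikibot configured, this could be called as talk_text(pw.Page(pw.Site('en', 'wikipedia'), 'Forensic science')), reusing the article title that already works.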
If you want to get an explicit error when the page does not exist, try the Page.get method:
import pywikibot as pw
page = pw.Page(pw.Site('en', 'wikipedia'), 'Talk:Forensics science')
text = page.get()
This will raise:
[...]
"...site.py", line 4166, in loadrevisions
raise NoPage(page)
pywikibot.exceptions.NoPage: Page [[wikipedia:en:Talk:Forensics science]] doesn't exist.
CRITICAL: Exiting due to uncaught exception <class 'pywikibot.exceptions.NoPage'>
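If an uncaught exception is more than you need, the existence check can also be made explicit before reading the text. This is a minimal sketch, assuming pywikibot's Page.exists(); the helper name read_page_text is hypothetical, and it accepts any object that offers exists() and a text attribute.

```python
def read_page_text(page):
    """Return the wikitext of `page`, or None if the page does not exist.

    `page` is assumed to behave like a pywikibot Page: it must provide
    an exists() method and a text attribute.
    """
    if not page.exists():
        # Distinguish "no such page" from "page exists but is empty".
        return None
    return page.text
```

With pywikibot configured, read_page_text(pw.Page(pw.Site('en', 'wikipedia'), 'Talk:Forensics science')) would be expected to return None for the misspelled title instead of silently yielding empty text.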