I'm learning to use the Wikipedia API to get public information about users. I found the script, "get_users.py", in MediaWiki-API-demos, which can fetch general information like edit count or email address. However, the personal description on the user page cannot be fetched this way.
(An example is shown below. I want to get text like "I'm not usually active on English Wikipedia. Please refer ...")
I found that "API: Get the contents of a page" offers an option to achieve that. Because I know nothing about PHP, may I ask is there any way we can get these textual contents using the API in Python?
Thank you a lot for your time in advance!
Update:
I'm trying to look up the user information for a list of users like the following:
If I want to fetch their personal statements, is there any way to execute the requests all at once, instead of looping over the users one by one and feeding each into the script? (The script comes from the demo: get_pages_revisions.py)
(Suppose we want to find the info for Catrope and Bob; the following attempt, made by modifying PARAMS, does not work correctly:
PARAMS = {
    "action": "query",
    "prop": "revisions",
    "titles": "User:Catrope|Bob",
    "rvprop": "timestamp|user|comment|content",
    "rvslots": "main",
    "formatversion": "2",
    "format": "json"
}
)
You don't have to know PHP to use information from "API: Get the contents of a page". Those are only URLs ending in .php, nothing more, and you can use these URLs with any language, e.g. Python. Even the code in get_users.py uses a URL ending in .php, and it doesn't use any PHP code for this.
You only have to add &format=json to get the data as JSON instead of HTML.
I don't know which URL you need for your data, but you can use it as a plain string:
import requests

# fetch the rendered HTML of the "Pet door" article via action=parse
r = requests.get("https://en.wikipedia.org/w/api.php?action=parse&page=Pet_door&prop=text&formatversion=2&format=json")
data = r.json()
print(data['parse']['text'])
Or you can write the params as a dictionary, like in get_users.py; this is more readable and makes it easier to change a param:
import requests

params = {
    'action': 'parse',
    # 'page': 'Pet_door',
    'page': 'USER:Catrope',
    # 'prop': 'text',
    'prop': 'wikitext',
    'formatversion': 2,
    'format': 'json'
}

r = requests.get("https://en.wikipedia.org/w/api.php", params=params)
data = r.json()

#print(data.keys())
#print(data)
#print('---')
#print(data['parse'].keys())
#print(data['parse'])
#print('---')
#print(data['parse']['text'])  # if you use param 'prop': 'text'
#print('---')

print(data['parse']['wikitext'])  # if you use param 'prop': 'wikitext'
print('---')
# print all non-empty lines
for line in data['parse']['wikitext'].split('\n'):
    line = line.strip()  # remove surrounding spaces
    if line:  # skip empty lines
        print('--- line ---')
        print(line)

print('---')

# get the first line of text (with "I'm not usually active on English Wikipedia. Please refer...")
print(data['parse']['wikitext'].split('\n')[0])
Because 'prop': 'text' returns HTML, you would need lxml or BeautifulSoup to search for information in the HTML. With 'prop': 'wikitext' it gives the text without HTML tags, so it was easier to use split('\n')[0] to get the first line with
I'm not usually active on English Wikipedia. Please refer to my [[mw:User:Catrope|user page]] at [[mw:|MediaWiki.org]].
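If you do want to work with the HTML from 'prop': 'text', here is a minimal sketch with BeautifulSoup (my addition, not from the original answer; picking the first <p> tag is an assumption about where the statement sits on the rendered page):

import requests
from bs4 import BeautifulSoup

params = {
    'action': 'parse',
    'page': 'USER:Catrope',
    'prop': 'text',  # returns rendered HTML instead of wikitext
    'formatversion': 2,
    'format': 'json'
}

r = requests.get("https://en.wikipedia.org/w/api.php", params=params)
html = r.json()['parse']['text']

soup = BeautifulSoup(html, 'html.parser')
first_p = soup.find('p')  # assumption: the statement is in the first paragraph
if first_p:
    print(first_p.get_text(strip=True))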
EDIT: action=parse doesn't have a method to get all pages in one request, so you have to use a for loop with 'page': 'USER:{}'.format(name):
import requests

for name in ['Catrope', 'Barek']:
    print('name:', name)

    params = {
        'action': 'parse',
        'page': 'USER:{}'.format(name),  # build the page title
        # 'prop': 'text',
        'prop': 'wikitext',
        'formatversion': 2,
        'format': 'json'
    }

    r = requests.get("https://en.wikipedia.org/w/api.php", params=params)
    data = r.json()

    #print(data['parse']['text'])
    print(data['parse']['wikitext'])
    print('---')
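One caveat (my addition, not part of the original answer): if a user page doesn't exist, action=parse returns an "error" object instead of a "parse" key, so data['parse'] would raise a KeyError. A minimal guard inside the loop could look like:

    # sketch of a guard for missing pages; the API reports code "missingtitle"
    if 'parse' in data:
        print(data['parse']['wikitext'])
    else:
        print('no page for', name, '-', data.get('error', {}).get('info'))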
EDIT: For a query with prop=revisions you have to use full titles:
"titles": "User:Catrope|User:Bob|User:Barek",
But not all titles give results, so you have to check whether there is a "revisions" key in each page's data:
import requests

S = requests.Session()

URL = "https://www.mediawiki.org/w/api.php"

PARAMS = {
    "action": "query",
    "prop": "revisions",
    "titles": "User:Catrope|User:Bob|User:Barek",
    "rvprop": "timestamp|user|comment|content",
    "rvslots": "main",
    "formatversion": "2",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS)
DATA = R.json()

PAGES = DATA["query"]["pages"]

for page in PAGES:
    if "revisions" in page:
        for rev in page["revisions"]:
            print(rev['slots']['main']['content'])
    else:
        print(page)
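To get back to your original goal, here is a small follow-up sketch (my addition, assuming the PAGES list from the snippet above) that maps each title to the first non-empty line of its latest revision, i.e. the personal statement:

statements = {}

for page in PAGES:
    if "revisions" in page:
        content = page["revisions"][0]["slots"]["main"]["content"]
        # first non-empty line of the wikitext, as with split('\n')[0] above
        lines = [line.strip() for line in content.split('\n') if line.strip()]
        statements[page["title"]] = lines[0] if lines else ''

print(statements)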