I'm currently working on a website scraper. Because I have to log in to access the website, a session ID
has to be generated and saved for later use.
The session ID is at the end of the URL:
https://example.com/something.php?sid=123456789
I tried using the geturl() method, but it only returns the URL without any parameters.
What would be the best way to get the URL parameters?
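You can use urlparse from the standard library:
from urllib.parse import urlparse
# The example URL from the question
url = 'https://example.com/something.php?sid=123456789'
parsed = urlparse(url)
print(parsed)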
The output:
ParseResult(scheme='https', netloc='example.com', path='/something.php', params='', query='sid=123456789', fragment='')
Then you can access the query string:
print(parsed.query)
The output:
sid=123456789
Then you can extract the session ID:
sid = parsed.query.split('sid=')[-1]
print(sid)
The output:
123456789
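Splitting on 'sid=' works for this URL, but it would also return any parameters that come after sid. As an alternative, here is a minimal sketch using parse_qs from the same urllib.parse module. It assumes the example URL from the question, and the helper name get_sid is hypothetical, chosen only for illustration:

```python
from urllib.parse import urlparse, parse_qs


def get_sid(url):
    """Return the value of the 'sid' query parameter, or None if it is absent."""
    # get_sid is a hypothetical helper name, not part of any library.
    query = urlparse(url).query
    # parse_qs maps each parameter name to a list of its values,
    # e.g. {'sid': ['123456789']}
    params = parse_qs(query)
    values = params.get("sid")
    return values[0] if values else None


print(get_sid("https://example.com/something.php?sid=123456789"))
print(get_sid("https://example.com/something.php?sid=123456789&page=2"))
```

Both calls print 123456789, because parse_qs separates the parameters before the lookup, so a trailing parameter such as page does not end up in the result.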