
Python Mechanize, how to get URL parameters


I'm currently working on a website scraper. Because I have to log in to access the website, a session ID is generated and has to be saved for later use.

The session ID is at the end of the URL.

https://example.com/something.php?sid=123456789

I tried the geturl() method, but it only returns the URL without any parameters.

What would be the best way to get the URL parameters?


Solution

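  • Parse the URL with urlparse from the standard library:

    from urllib.parse import urlparse
    
    # Example URL from the question
    url = 'https://example.com/something.php?sid=123456789'
    parsed = urlparse(url)
    print(parsed)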
    

    The output:

    ParseResult(scheme='https', netloc='example.com', path='/something.php', params='', query='sid=123456789', fragment='')
    

    Then you can access the query string:

    print(parsed.query)
    

    The output:

    sid=123456789
    

    Then you can extract the session ID (this simple split assumes sid is the last parameter in the query string):

    sid = parsed.query.split('sid=')[-1]
    print(sid)
    

    The output:

    123456789
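
If the query string may contain more than one parameter, splitting on 'sid=' can return extra text. A slightly more robust sketch uses parse_qs from the same standard-library module. The helper name get_query_param and the extra page=2 parameter below are made up for illustration, and the code assumes url already holds the full URL string, query included:

```python
from urllib.parse import urlparse, parse_qs


def get_query_param(url, name, default=None):
    """Return the first value of query parameter `name`, or `default` if absent.

    Hypothetical helper for illustration; not part of mechanize or urllib.
    """
    query = urlparse(url).query          # e.g. 'sid=123456789&page=2'
    values = parse_qs(query).get(name)   # parse_qs maps each name to a list of values
    return values[0] if values else default


# Example URL from the question, with an assumed extra parameter appended
url = 'https://example.com/something.php?sid=123456789&page=2'
sid = get_query_param(url, 'sid')
print(sid)  # 123456789
```

Because parse_qs returns a list for every parameter name, the helper takes the first value and falls back to default when the parameter is missing.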