python, python-3.x, urllib

How to deal with `;` with `urllib.parse.parse_qsl()`?


`parse_qsl()` cannot deal with `;`: in the example below, the value of `p` is cut off at the first `;`. Is there a way to make it handle `;`? Thanks.

>>> import urllib.parse
>>> urllib.parse.parse_qsl('http://example.com/?q=abc&p=1;2;3')
[('http://example.com/?q', 'abc'), ('p', '1')]

Solution

  • It would be best to make sure that the URLs you are dealing with have the semicolons URL-encoded, e.g. http://example.com/?q=abc&p=1%3B2%3B3

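    If for some reason you can't do the above, you could split the query string on `&` yourself, something like this. Note that only the query component is parsed here; passing the whole URL to parse_qsl() is why the first key in the question came out as 'http://example.com/?q'.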

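    from urllib.parse import urlparse, unquote_plus

    url = "http://example.com/?q=abc&p=1;2;3"
    # Extract only the query component ("q=abc&p=1;2;3") from the URL.
    parts = urlparse(url)
    qs = parts.query
    # Split on "&" only, so ";" stays inside the values.
    pairs = [p.split("=", 1) for p in qs.split("&")]
    # Decode percent-escapes and "+" in both keys and values.
    decoded = [(unquote_plus(k), unquote_plus(v)) for (k, v) in pairs]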
    
    >>> decoded
    [('q', 'abc'), ('p', '1;2;3')]
    

    The above code assumes a few things about the query string, e.g. that every key has a value: a bare key such as `flag` in `?q=abc&flag` makes the tuple unpacking raise ValueError. If you want something that makes fewer assumptions, see the parse_qsl source code.
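
As a rough illustration of the "fewer assumptions" case, here is a minimal sketch that tolerates bare keys, empty pieces, and an empty query string. The helper name `parse_query_keep_semicolons` and its `keep_blank_values` flag are made up for this example (the flag only loosely mirrors the parse_qsl option of the same name); it is not how parse_qsl itself is implemented.

```python
from urllib.parse import urlparse, unquote_plus


def parse_query_keep_semicolons(url, keep_blank_values=True):
    """Split the query of `url` on '&' only, leaving ';' inside values.

    Hypothetical helper for illustration; not part of urllib.
    """
    query = urlparse(url).query
    result = []
    for piece in query.split("&"):
        if not piece:
            # Skip empty pieces from an empty query, '&&', or a trailing '&'.
            continue
        # partition() never raises: a bare key simply gets an empty value.
        key, _, value = piece.partition("=")
        if value or keep_blank_values:
            result.append((unquote_plus(key), unquote_plus(value)))
    return result


print(parse_query_keep_semicolons("http://example.com/?q=abc&p=1;2;3&flag"))
# [('q', 'abc'), ('p', '1;2;3'), ('flag', '')]
```

On Python 3.10 and later, parse_qsl() splits only on `&` by default and accepts a `separator` argument, so `parse_qsl(urlparse(url).query)` should already keep `1;2;3` intact there; a helper like this is mainly of use on older interpreters.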
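
For the first suggestion (making sure the semicolons are URL-encoded), one possible approach, assuming you control how the URL is built, is to let `urlencode()` produce the query string. The parameter list below is just the example data from the question.

```python
from urllib.parse import urlencode, urlparse, parse_qsl

# Example parameters taken from the URL in the question.
params = [("q", "abc"), ("p", "1;2;3")]

# urlencode() percent-encodes reserved characters such as ';' in values.
url = "http://example.com/?" + urlencode(params)
print(url)
# http://example.com/?q=abc&p=1%3B2%3B3

# Parse only the query component, not the whole URL.
print(parse_qsl(urlparse(url).query))
# [('q', 'abc'), ('p', '1;2;3')]
```

Because no raw `;` is left in the query string, this round-trips the same way whether or not the interpreter treats `;` as a separator.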