python, python-2.7, urlparse

Split the title part of the URL into a separate column - Python


Suppose I have a URL as follows:

http://sitename.com/pathname?title=moviename&url=VIDEO_URL

I want to parse this URL and extract the title value and the url value separately.

I tried the following:

from urlparse import urlparse
q = urlparse('http://sitename.com/pathname?title=moviename&url=VIDEO_URL')

After I do this, I get the following result:

q
ParseResult(scheme='http', netloc='sitename.com', path='/pathname', params='', query='title=moviename&url=VIDEO_URL', fragment='')

and q.query contains:

'title=moviename&url=VIDEO_URL'

q.query is a plain string, so I cannot use q.query.title or q.query.url here. Is there a way to access these values? I would like to split the url and title parts into separate columns. Can it be done this way, or should I write a substring method that looks for text starting with "title" and ending with "&" and splits on that?

Thanks


Solution

  • You can use urlparse.parse_qs here to turn the query string into a dictionary that maps each parameter name to a list of its values.

    from urlparse import urlparse, parse_qs
    q = urlparse('http://sitename.com/pathname?title=moviename&url=VIDEO_URL')
    qs = parse_qs(q.query)
    # parse_qs returns a list for each key, so take the first element
    print qs["title"][0] # moviename
    print qs["url"][0] # VIDEO_URL
    

    This is more reliable than splitting the string by hand: parse_qs also decodes percent-encoded values and copes with repeated parameters.
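
To get the title and url into separate columns, the same idea can be wrapped in a small helper and applied to each row. This is only a sketch: the function name split_title_and_url, the links list, and the second sample URL are made up for illustration, and the import fallback is there only so the snippet also runs on Python 3.

```python
try:
    from urlparse import urlparse, parse_qs  # Python 2
except ImportError:
    from urllib.parse import urlparse, parse_qs  # Python 3


def split_title_and_url(link):
    """Return (title, url) from the query string; None for a missing key."""
    params = parse_qs(urlparse(link).query)
    # parse_qs maps each key to a list of values; take the first one.
    title = params.get("title", [None])[0]
    url = params.get("url", [None])[0]
    return title, url


# Hypothetical input rows; the second one has no url parameter.
links = [
    "http://sitename.com/pathname?title=moviename&url=VIDEO_URL",
    "http://sitename.com/pathname?title=othermovie",
]

# Each tuple holds the two values destined for the two columns.
rows = [split_title_and_url(link) for link in links]
for row in rows:
    print(row)
```

Using params.get with a [None] default means a row with a missing parameter yields None for that column instead of raising a KeyError.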