How do I get specific path sections from a URL? For example, I want a function that operates on this:
http://www.mydomain.com/hithere?image=2934
and returns "hithere"
or operates on this:
http://www.mydomain.com/hithere/something/else
and returns the same thing ("hithere")
I know this will probably use urllib or urllib2 but I can't figure out from the docs how to get only a section of the path.
Extract the path component of the URL with urlparse (Python 2.7):
import urlparse
path = urlparse.urlparse('http://www.example.com/hithere/something/else').path
print path
> /hithere/something/else
or urllib.parse (Python 3):
import urllib.parse
path = urllib.parse.urlparse('http://www.example.com/hithere/something/else').path
print(path)
> /hithere/something/else
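For the first URL in the question, the query string is not part of the path, so urlparse already separates it out. A minimal sketch in Python 3 (the variable name parts is only illustrative):

```python
from urllib.parse import urlparse

# urlparse splits the URL into scheme, netloc, path, query, and so on.
parts = urlparse('http://www.mydomain.com/hithere?image=2934')

print(parts.path)   # /hithere
print(parts.query)  # image=2934
```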
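Split the path into a head (everything before the last slash) and a tail (the last component) with os.path.split. Since URL paths always use forward slashes, posixpath.split is the safer choice if the code may run on a platform with a different path separator: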
>>> import os.path
>>> os.path.split(path)
('/hithere/something', 'else')
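If you want all of the components at once rather than one head/tail pair, one possible alternative (not part of the approach above) is pathlib.PurePosixPath, which always treats / as the separator. A sketch in Python 3:

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

path = urlparse('http://www.mydomain.com/hithere/something/else').path

# .parts is a tuple of components; the root '/' comes first.
parts = PurePosixPath(path).parts

print(parts)     # ('/', 'hithere', 'something', 'else')
print(parts[1])  # hithere
```

Note that parts[1] assumes the URL has at least one path segment; for a URL with an empty path the tuple is empty and indexing it raises IndexError.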
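os.path.dirname and os.path.basename return the head and the tail of that split, respectively. To get the first component, apply dirname repeatedly until only one level remains (the check for '' stops the loop from running forever when the path is empty or relative):
>>> while os.path.dirname(path) not in ('/', ''):
...     path = os.path.dirname(path)
...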
>>> path
'/hithere'
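Putting it together, here is a minimal sketch of the function the question asks for, in Python 3. It splits the path on '/' instead of looping over dirname, which is a swapped-in technique rather than the method shown above; the name first_path_segment is hypothetical:

```python
from urllib.parse import urlsplit


def first_path_segment(url):
    """Return the first non-empty segment of the URL's path, or '' if there is none."""
    path = urlsplit(url).path  # the query string and fragment are not part of .path
    # Drop empty strings produced by the leading slash or by doubled slashes.
    segments = [segment for segment in path.split('/') if segment]
    return segments[0] if segments else ''


print(first_path_segment('http://www.mydomain.com/hithere?image=2934'))      # hithere
print(first_path_segment('http://www.mydomain.com/hithere/something/else'))  # hithere
```

Returning an empty string for a URL without a path is a design choice; raising an exception or returning None may suit your code better.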