Splitting a URL into a list in Python


I am currently working on a project that involves splitting a URL. I have used the urlparse module to break up the URL, so now I am working with just the path segment.

The problem is that when I try to split() the string based on the delimiter "/" to separate the directories, I end up with empty strings in my list.

For example, when I do the following:

import urlparse  # Python 2 module; in Python 3 it lives in urllib.parse
url = "http://example/url/being/used/to/show/problem"
parsed = urlparse.urlparse(url)
path = parsed[2]  # this is the path element

pathlist = path.split("/")

I get the list:

['', 'url', 'being', 'used', 'to', 'show', 'problem']

I do not want these empty strings. I realize that I can remove them by making a new list without them, but that seems sloppy. Is there a better way to remove the empty strings and slashes?


Solution

  • I am not familiar with urllib and its output for the path, but one way to build the new list is a list comprehension that keeps only the non-empty parts:

    [x for x in path.split("/") if x]
    

    Or, if there is only a leading '/', something like this:

    path.lstrip('/').split("/")
    

    Or, if there may be a trailing '/' too:

    path.strip('/').split("/")
    

    Finally, if the string in path always starts with a single '/', the easiest way is to slice it off:

    path[1:].split('/')
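
The variants above behave differently once the path has a trailing or doubled slash, so a short comparison may help. This is only a sketch: it assumes Python 3 (where the module is imported from `urllib.parse`), and the helper name `path_segments` and the `messy` sample path are made up for illustration.

```python
from urllib.parse import urlparse  # assumption: Python 3 import path


def path_segments(url):
    """Return the non-empty path segments of url as a list (hypothetical helper)."""
    path = urlparse(url).path
    # Filtering on truthiness drops every empty string that split() produces,
    # whether it comes from a leading, trailing, or doubled slash.
    return [segment for segment in path.split("/") if segment]


url = "http://example/url/being/used/to/show/problem"
print(path_segments(url))
# ['url', 'being', 'used', 'to', 'show', 'problem']

# strip("/") only trims the ends, so a doubled slash in the middle
# still leaves an empty string behind.
messy = "/url//being/used/"  # made-up path for illustration
print(messy.strip("/").split("/"))
# ['url', '', 'being', 'used']
print([s for s in messy.split("/") if s])
# ['url', 'being', 'used']
```

If the paths are known to be clean, `strip('/')` or slicing is enough. The list comprehension is the safer choice when the input is not under your control, and it also returns an empty list for a bare "/" path instead of `['']`.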