Split multiple URLs using urlparse in Python

I have a string containing multiple URLs extracted with BeautifulSoup, and I want to split each URL to extract the year and date (both appear in the URL path).

print(dat)
http://www.foo.com/2016/01/0124
http://www.foo.com/2016/02/0122
http://www.foo.com/2016/02/0426
http://www.foo.com/2016/03/0129
.
.

I tried the following, but it only retrieves the first URL:

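from urllib.parse import urlparse  # Python 3; on Python 2 use: from urlparse import urlparse
parsed = urlparse(dat)
path = parsed[2]  # the path component, i.e. everything after www.foo.com
pathlist = path.split("/")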

['', '2016', '01', '0124']

So I am only getting the result for the first element of the string. How can I parse all of the URLs and store the results so I can extract information from them? I would like to know how many links there are for each year and month.

Also, strangely, after doing this, print(dat) shows only the first element, http://www.foo.com/2016/01/0124. It seems that urlparse does not work on multiple URLs.

Solution

Based on your question, it looks like you have a single string of URLs separated by newlines. urlparse handles one URL at a time, so in that case you can split the string and use a for loop to parse each URL:

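from urllib.parse import urlparse  # Python 3; on Python 2 use: from urlparse import urlparse

list_pathlist = []
for url in dat.split('\n'):  # one URL per line
    parsed = urlparse(url)
    path = parsed[2]  # the path component, i.e. everything after www.foo.com
    pathlist = path.split("/")
    list_pathlist.append(pathlist)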

In that case, I suspect the result (list_pathlist) will look something like:

[['', '2016', '01', '0124'], ['', '2016', '02', '0122'], ...]

that is, a list of lists, one per URL. The leading empty string is there because each path starts with "/".

Or you can put it into a nice one-liner using a list comprehension:

list_pathlist = [urlparse(url)[2].split('/') for url in dat.split('\n')]
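
To get the number of links per year and month, which the question also asks about, one option is to tally the first two path segments with collections.Counter. The following is only a minimal sketch: it assumes Python 3, assumes every path has the form /YYYY/MM/..., and uses a hypothetical helper name (count_by_year_month) along with a sample dat built from the four URLs shown in the question.

```python
from collections import Counter
from urllib.parse import urlparse  # Python 3; on Python 2 use: from urlparse import urlparse


def count_by_year_month(text):
    """Count URLs per (year, month), assuming paths look like /YYYY/MM/..."""
    counts = Counter()
    for line in text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines, e.g. from a trailing newline
        parts = urlparse(url).path.strip("/").split("/")
        if len(parts) >= 2:  # ignore URLs without both a year and a month segment
            counts[(parts[0], parts[1])] += 1
    return counts


# Sample input: the four URLs shown in the question (assumed, one per line)
dat = """http://www.foo.com/2016/01/0124
http://www.foo.com/2016/02/0122
http://www.foo.com/2016/02/0426
http://www.foo.com/2016/03/0129"""

counts = count_by_year_month(dat)
for (year, month), n in sorted(counts.items()):
    print(year, month, n)
```

The keys are (year, month) tuples of strings, so counts[('2016', '02')] gives the number of links for February 2016. Stripping the slashes before splitting avoids the leading empty string seen in the lists above.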