How to get a specific part of a URL using urlparse()?

I have a URL like this:

url = 'https://grabagun.com/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'

When I use the urlparse() function, I get a result like this:

>>> url = urlparse(url) 
>>> url.path
'/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'

Is it possible to get something like this:

path1 = "firearms"
path2 = "handguns"
path3 = "semi-automatic-handguns"

and I don't want to get any part that has ".html" at the end.

Solution

The full URL contains both single slashes and a double slash (the // after https:), so if you want to split the full URL directly, first replace them all with the same separator. On url.path you can split directly:

url = '/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'

url = url.split('/')
url = list(filter(None, url))  # remove empty elements
url.pop()  # drop the last element (the ".html" part)
print(url)

Output:

['firearms', 'handguns', 'semi-automatic-handguns']
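The same steps can also start from the full URL. The following is only a minimal sketch: the helper name path_segments is made up for illustration, and it assumes that "no .html" means skipping any segment that ends in ".html" rather than always dropping the last one.

```python
from urllib.parse import urlparse

def path_segments(url):
    """Return the non-empty path segments of url, skipping any that end in '.html'.

    path_segments is a hypothetical helper name, not part of urllib.
    """
    path = urlparse(url).path
    # Splitting on '/' yields empty strings for leading, trailing, or doubled
    # slashes, so those are filtered out along with the '.html' segment.
    return [seg for seg in path.split('/') if seg and not seg.endswith('.html')]

url = 'https://grabagun.com/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'
print(path_segments(url))
# ['firearms', 'handguns', 'semi-automatic-handguns']
```

Unlike url.pop(), this keeps the last segment when the URL does not end in an ".html" page.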

Part 2

If you want them as separate variables, iterate over the list and create one variable per element:

for n, val in enumerate(url, start=1):  # start at 1 so the first segment is path1
    globals()["path%d" % n] = val

print(path1)

Output:

firearms
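Writing into globals() works, but a dictionary is usually easier to handle when the number of segments is not known in advance. A minimal sketch, assuming the segment list from above; the names segments and paths are only illustrative, and the keys are numbered from 1 as in the question:

```python
# Segment list as produced by the split/filter/pop steps above.
segments = ['firearms', 'handguns', 'semi-automatic-handguns']

# Build keys path1, path2, ... instead of creating global variables.
paths = {"path%d" % n: seg for n, seg in enumerate(segments, start=1)}

print(paths["path1"])
# firearms
```

A missing segment can then be handled with paths.get("path4") instead of raising a NameError.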