I have a URL like this:
url = 'https://grabagun.com/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'
When I use the urlparse() function, I get a result like this:
>>> url = urlparse(url)
>>> url.path
'/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'
Is it possible to get something like this:
path1 = "firearms"
path2 = "handguns"
path3 = "semi-automatic-handguns"
I also don't want any segment that ends in ".html".
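The full URL contains both single / separators and a double // (after the scheme), so if you want to split the full URL directly, first replace them all with the same separator. With url.path there is no // to worry about, so you can split it directly: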
url = '/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'
url = url.split('/')
url = list(filter(None, url))  # remove empty elements
url.pop()  # drop the last element (the ".html" file name)
print(url)
Output:
['firearms', 'handguns', 'semi-automatic-handguns']
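Note that url.pop() always drops the last element, even when it is not an ".html" file. A minimal sketch of a variant that starts from the full URL and only skips segments ending in ".html" could look like this (the helper name path_segments is made up for illustration):

```python
from urllib.parse import urlparse


def path_segments(url):
    """Return the non-empty path segments of url, skipping any ending in '.html'."""
    path = urlparse(url).path
    # split('/') yields empty strings for leading or doubled slashes,
    # so filter those out together with the '.html' segments
    return [seg for seg in path.split('/') if seg and not seg.endswith('.html')]


url = 'https://grabagun.com/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'
segments = path_segments(url)
print(segments)
```

For the URL in the question this should give the same three segments as above, and a URL whose path ends in a directory keeps that directory.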
Part 2
If you want them as separate variables, simply iterate over the list and create the variables:
for n, val in enumerate(url, start=1):  # start at 1 so path1 is the first segment
    globals()["path%d" % n] = val
print(path1)
Output:
firearms
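Writing into globals() works, but it is easy to lose track of which names exist. As an alternative sketch (not part of the approach above), the same values can be kept in a dictionary keyed by the names from the question, numbered from 1, or unpacked directly when the number of segments is known in advance. The names segments and paths are assumptions for illustration:

```python
segments = ['firearms', 'handguns', 'semi-automatic-handguns']

# Build {"path1": ..., "path2": ...} instead of creating global variables
paths = {"path%d" % n: val for n, val in enumerate(segments, start=1)}
print(paths["path1"])

# If there are always exactly three segments, plain unpacking is enough
path1, path2, path3 = segments
print(path3)
```

The dictionary version copes with any number of segments, while unpacking raises a ValueError if the count differs from three.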