I have a set of links that looks like the following:
links = ['http://www.website.com/category/subcategory/1',
'http://www.website.com/category/subcategory/2',
'http://www.website.com/category/subcategory/3',...]
I want to extract the 1, 2, 3, and so on from this list, and store the extracted data in subcategory_explicit. They're stored as str, and I'm having trouble getting at them with the following code:
subcategory_explicit = [cat.get('subcategory') for cat in links if cat.get('subcategory') is not None]
Do I have to change my data type from str to something else? What would be a better way to obtain and store the extracted values?
subcategory_explicit = [i[i.find('subcategory'):] for i in links if 'subcategory' in i]
This takes a substring by slicing, from the "s" in "subcategory" to the end of the string, so each result looks like "subcategory/1". By adding len('subcategory') to the value returned by find, you can exclude "subcategory" and get "/#" (where # is whatever number). Add one more to skip the slash and keep only the number.
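As a rough sketch of the offset idea, assuming the links always contain "subcategory/" followed by the value you want: the name marker is made up for illustration, and the three-item list stands in for the full list from the question. The rsplit line is a different technique, shown only as an alternative that takes whatever follows the last slash.

```python
links = [
    'http://www.website.com/category/subcategory/1',
    'http://www.website.com/category/subcategory/2',
    'http://www.website.com/category/subcategory/3',
]

# Hypothetical helper name; includes the trailing slash so it is skipped too.
marker = 'subcategory/'

# Slice from just past the marker, keeping only what follows it.
subcategory_explicit = [
    link[link.find(marker) + len(marker):]
    for link in links
    if marker in link
]

# Alternative (not the slicing approach): split once on the last slash
# and keep the final piece.
last_segments = [link.rsplit('/', 1)[-1] for link in links]

print(subcategory_explicit)
print(last_segments)
```

The extracted values are still str, so there is no need to change the type of the links themselves. If you need numbers, wrapping each result in int() should work as long as every value is purely digits.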