I am trying to extract just the ICID bit from my URL:
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
So what I am after really is:
ICID=secondary_pricing_goldplus_cust_paymentpage_anon
I am trying the following, but obviously it doesn't work. The ICID bit could be anywhere in the URL (beginning, middle or end). Any help would be highly appreciated:
from urllib.parse import urlparse
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
obj = urlparse(url)
print(obj)
query = obj.query
print (query)
path_list = query.split("ICID(.+?)&")
print (path_list)
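You're halfway there. Using urlparse is the right first step, but str.split treats its argument as a literal separator rather than a regular expression, so your pattern never matches. Instead, use parse_qs (also from urllib.parse) to parse the query: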
from urllib.parse import urlparse, parse_qs
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
query = urlparse(url).query
path_list = parse_qs(query)['ICID']
Output:
>>> print(path_list)
['secondary_pricing_goldplus_cust_paymentpage_anon']
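Note that parse_qs returns a list for each key and raises a KeyError if the key is missing. If you want the literal "ICID=..." string from the question, and a URL without ICID is possible, something like the following sketch may help. The function name extract_param and the example.com URLs are made up for illustration, and it assumes you only care about the first occurrence of the parameter:

```python
from urllib.parse import urlparse, parse_qs

def extract_param(url, name="ICID"):
    """Return 'name=value' for the first occurrence of name, or None if absent."""
    # parse_qs maps each key to a list of values; order in the URL does not matter
    params = parse_qs(urlparse(url).query)
    values = params.get(name)  # .get avoids a KeyError when the key is missing
    if not values:
        return None
    return f"{name}={values[0]}"

# Hypothetical URLs, with the parameter in different positions
print(extract_param("https://example.com/?ICID=abc&a=1"))
print(extract_param("https://example.com/?a=1&ICID=abc&b=2"))
print(extract_param("https://example.com/?a=1"))
```

Keep in mind that parse_qs percent-decodes values and, by default, drops parameters with blank values, so the returned string may not be byte-for-byte identical to what appears in the URL.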