python, pandas, split, pandasql

Python, pandas: How to split a string on a specific word rather than "," or "_", etc.


I'm having a hard time extracting an ID number from a string.

I could get it by index, but that would fail for the other rows of the DataFrame.

How do I extract campaignid=351154190 in a way that works for all rows?

The only pattern is the word campaignid. I need to extract the ID and store it in a new column of the DataFrame. Performance is not crucial for this task.

Original string

https:_utm_source=googlebrand&utm_medium=ppc&utm_campaign=brand&utm_campaignid=351154190&keyword=aihdisadjiajdutm_matchtype=e&device=m&utm_network=g&utm_adposition=1t1&geo=9027258&gclid=CjwKCsadjjsaopdl[psdklksfdosjfidj9FOk033DKW1xoCXlwQAvD_BwE&affiliate_id=asdaskdosjadiasjdisaj-asdhasuigdyusagdyusagyk033DKW1xoCXlwQAvD_BwE&utm_content=search&utm_contentid=1251489456158180&placement&extension

Splitting the string

x = cw.captureurl.str.split('&').str[:-1]

Printing one row

print(x[25])

['https:_utm_source=googlebrand', 'utm_medium=ppc', 'utm_campaign=brand', 
'utm_campaignid=35119190', 'keyword=co',
 'utm_matchtype=e', 'device=m', 'utm_network=g', 'utm_adposition=1t1',
 'geo=9027258', 'gclid=CjwKCAjwnMTqBRAzEiwAEF3ndo3-
CNOsp1VT5OIxm0BuUcSWQEwtJSR5KLiJzrvjjc9FOk033DKW1xoCXlwQAvD_BwE',
 'affiliate_id=CjwKCAjwnMTqBRAzEiwAEF3ndo3-
CNOsp1VT5OIxm0BuUcSWQEwtJSR5KLiJzrvjjc9FOk033DKW1xoCXlwQAvD_BwE', 
'utm_content=search', 'utm_contentid=1211732930', 'placement']

It would be great if I could use something that searches for the word "campaignid" (which is my target).

Then store it in another column of the same DataFrame.

I tried doing a split after a split, but it didn't work. I also tried a for loop with an if statement, and that didn't work either.


Solution

  • Use regex:

    campaign_id = cw['captureurl'].str.extract('campaignid=(\\d+)')[0]
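
To illustrate the regex approach end to end, here is a minimal sketch that stores the extracted ID in a new column. The sample rows and the new column name `campaignid` are assumptions made up for the illustration; `cw` and `captureurl` follow the question.

```python
import pandas as pd

# Hypothetical sample data; only the names `cw` and `captureurl` come from the question.
cw = pd.DataFrame({
    "captureurl": [
        "https:_utm_source=googlebrand&utm_medium=ppc&utm_campaignid=351154190&keyword=co",
        "https:_utm_source=googlebrand&utm_campaignid=35119190&device=m",
        "https:_utm_source=googlebrand&device=m",  # no campaignid in this row
    ]
})

# The capture group (\d+) grabs the digits that follow "campaignid=".
# A raw string avoids the doubled backslash, and expand=False returns
# a Series (instead of a one-column DataFrame) for a single group.
cw["campaignid"] = cw["captureurl"].str.extract(r"campaignid=(\d+)", expand=False)

print(cw[["campaignid"]])
```

The extracted values are strings, and rows where the pattern does not match end up as missing values (NaN), so a numeric conversion such as `pd.to_numeric` would be a separate step if integers are needed.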