python pandas strsplit multiple-conditions

Extract first word for each row in a column under multiple conditions


I have a dataset that contains a column of strings. It looks like:

df.a=[['samsung/windows','mobile unknown','chrome/android']]. I am trying to replace each row's string with its first word, e.g. [['samsung','mobile','chrome']].

I applied:

df.a=df.a.str.split().str.get(0)

This returns the first whitespace-delimited token, so values like 'samsung/windows' still contain the "/".

df.a=[words.split("/")[0] for words in df.a]

This only splits the strings that contain "/"; 'mobile unknown' is left whole.

Can I get the expected result using one line?


Solution

  • Use re.findall() to match runs of alphanumeric characters and keep the first match

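    import re
    # [\w']+ matches a run of letters, digits, underscores, or apostrophes,
    # so both "/" and spaces end the first match; [0] keeps that first match.
    df['a'] = df['a'].apply(lambda x: re.findall(r"[\w']+", x)[0])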
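
The same pattern can probably also be applied without apply(), using pandas' own Series.str.extract(). This is a minimal sketch rather than the answer's method: it rebuilds the sample column from the question, and the name first_word is only illustrative. One difference worth knowing is that a row with no match (an empty string or a missing value) comes back as NaN, whereas indexing re.findall(...)[0] would raise on such a row.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({"a": ["samsung/windows", "mobile unknown", "chrome/android"]})

# Capture the first run of word characters (or apostrophes) in each row.
# expand=False returns a Series instead of a one-column DataFrame.
first_word = df["a"].str.extract(r"([\w']+)", expand=False)

df["a"] = first_word
print(df["a"].tolist())  # ['samsung', 'mobile', 'chrome']
```

The regex needs exactly one capture group for expand=False to return a Series, which is why the pattern is wrapped in parentheses.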