
Split values in Data frame columns


I have a DataFrame named df, and I want to split the '|'-separated values in the fuel column into separate rows.

id  car       fuel
1   Mercedes  petrol|diesel|gas
2   Audi      gas|petrol   

I want the data to look like this:

id  car        fuel
1   Mercedes   petrol
1   Mercedes   diesel
1   Mercedes   gas
2   Audi       gas
2   Audi       petrol

This is the code that I tried:

df_1 = df.copy()
df_2 = df.copy()
df_3 = df.copy()

df_1['fuel'] = df_1['fuel'].apply(lambda x:x.split('|')[0])
df_2['fuel'] = df_2['fuel'].apply(lambda x:x.split('|')[1])
df_3['fuel'] = df_3['fuel'].apply(lambda x:x.split('|')[2])

This raises IndexError: list index out of range


Solution

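  • The IndexError occurs because 'gas|petrol' splits into only two items, so index [2] does not exist for that row. Try this, which handles any number of values per row: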

        import pandas as pd

        df = pd.DataFrame({'car': ['Mercedes', 'Audi'], 'fuel': ['petrol|diesel|gas', 'gas|petrol']})  # your dataframe
        rows = []                                                  # collected output rows
        for i in range(len(df)):                                   # iterate over df
            fuels = df.iloc[i, 1].split('|')                       # split each 'fuel' value into a list
            for fuel in fuels:                                     # iterate over the split values
                rows.append({'car': df.iloc[i, 0], 'fuel': fuel})  # one dict per (car, fuel) combination
        df2 = pd.DataFrame(rows)                                   # build the new dataframe once (DataFrame.append was removed in pandas 2.0)
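
  • A shorter alternative, assuming pandas 0.25 or newer (where DataFrame.explode is available): split the column into lists with str.split, then give each list element its own row. This is only a sketch. The id column is added here to match the table in the question, and df2 is just an illustrative name.

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    'id': [1, 2],
    'car': ['Mercedes', 'Audi'],
    'fuel': ['petrol|diesel|gas', 'gas|petrol'],
})

# Turn each 'fuel' string into a list, then expand every list element
# into its own row; the other columns are repeated for each element.
df2 = (
    df.assign(fuel=df['fuel'].str.split('|'))
      .explode('fuel')
      .reset_index(drop=True)
)

print(df2)
```

    This avoids the explicit loops and works for any number of values per row. If the real data has stray spaces around the values, following up with df2['fuel'].str.strip() may be needed.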