Search code examples
pythonsplitkeyerror

Python: KeyError When Searching by a Value in One Column in Another Column That Has The Values Contained Between # Signs


I am trying to find if values in one column (Column_A) are contained in another column (Column_B) that have values that are separated by # signs. Then I would like to return the matching value in a new column (Column_C).

The data looks like this: enter image description here

I have tried this:

def find_matching_value(row):
    a_values=[float(x) for x in row[my_df1.Column_B].split('#') if x !='']
    if row[my_df1.Column_A] in a_values:
        return row['Column_A']
    else:
        return None

my_df2['Column_C']=my_df1.apply(find_matching_value, axis=1)

However, I get a key error:

KeyError: "None of [Index(['#16.990000#16.990000#16.990000#28.990000#28.990000#16.99#16.99#16.99#21.99#21.99#21.99,\n'#13.490000#13.490000#13.990000#11.99#11.99#12.49', \n'#13.490000#13.390000#14.490000#16.990000#16.990000#17.490000#12.49#12.49#12.99#15.99#15.99#15.99'dtype='object', length=3)] are in the [index]"

Not sure what is causing the issue here.


Solution

  • There is no need to use apply here, you can slighty ajudst your listcomp :

    my_df1["Column_C"] = [a if any([b == str(a) for b in lst]) else None
                          for a, lst in zip(my_df1["Column_A"],
                                            my_df1["Column_B"].str.split("#"))]  
    

    Output :

    print(my_df1)
    ​
       Column_A                                                                                Column_B Column_C
    0     16.99  #16.990000#16.990000#16.990000#28.990000#28.990000#16.99#16.99#16.99#21.99#21.99#21.99    16.99
    1     11.99                                                    #13.49#13.49#13.99#11.99#11.99#12.49    11.99
    2     12.49                #13.49#13.39#14.49#16.99#16.99#17.49#12.49#12.49#12.99#15.99#15.99#15.99    12.49