Search code examples
pythonpandasmergekeyerror

raise KeyError(key) from err KeyError: 'right_only' / 'both'


I try to develop python comparer. I got issue with my variable named: compare_type which i want to set either to l-left join / r - right join / b - inner join (both)

if i set compare_type = 'l'; all works fine

Nevertheless when i do either compare_type = 'r'; or compare_type = 'b'; ten i got following errors:

   raise KeyError(key) from err
KeyError: 'both'

or

    raise KeyError(key) from err
KeyError: 'right_only'

What am i doing wrong?

Complete code:

import pandas as pd

col_to_compare = 0;
compare_type = 'r';     #l-left join / r - right join / b - inner join (both)

file1_df = pd.read_csv('filename1.csv', usecols=[col_to_compare], names=[col_to_compare])
file2_df = pd.read_csv('filename2.csv', usecols=[col_to_compare], names=[col_to_compare])

file1_df[col_to_compare] = file1_df[col_to_compare].str.upper()
file2_df[col_to_compare] = file2_df[col_to_compare].str.upper()

comparison_result = pd.merge(file1_df, file2_df, on=col_to_compare,
                             how='left' if (compare_type == 'l') else 'right' if (compare_type == 'r') else 'inner',
                             indicator=True)

comparison_result = comparison_result.loc[comparison_result['_merge'] == 'left_only' if (compare_type == 'l') else 'right_only' if (compare_type == 'r') else 'both']

print(comparison_result)
comparison_result.to_csv('result.csv')

Full traceback:

C:\Users\john\PycharmProjects\L1\venv\Scripts\python.exe C:/Users/john/PycharmProjects/L1/CsvComparer/csv_comparer.py
Traceback (most recent call last):
  File "C:\Users\john\PycharmProjects\L1\venv\lib\site-packages\pandas\core\indexes\base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index_class_helper.pxi", line 105, in pandas._libs.index.Int64Engine._check_type
  File "pandas\_libs\index_class_helper.pxi", line 105, in pandas._libs.index.Int64Engine._check_type
KeyError: 'right_only'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/Users/john/PycharmProjects/L1/CsvComparer/csv_comparer.py", line 29, in <module>
    comparison_result = comparison_result.loc[comparison_result['_merge'] == 'left_only' if (compare_type == 'l') else 'right_only' if (compare_type == 'r') else 'both']
  File "C:\Users\john\PycharmProjects\L1\venv\lib\site-packages\pandas\core\indexing.py", line 931, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "C:\Users\john\PycharmProjects\L1\venv\lib\site-packages\pandas\core\indexing.py", line 1164, in _getitem_axis
    return self._get_label(key, axis=axis)
  File "C:\Users\john\PycharmProjects\L1\venv\lib\site-packages\pandas\core\indexing.py", line 1113, in _get_label
    return self.obj.xs(label, axis=axis)
  File "C:\Users\john\PycharmProjects\L1\venv\lib\site-packages\pandas\core\generic.py", line 3776, in xs
    loc = index.get_loc(key)
  File "C:\Users\john\PycharmProjects\L1\venv\lib\site-packages\pandas\core\indexes\base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'right_only'

Process finished with exit code 1

Solution

  • You have issue with this following line :

    comparison_result['_merge'] == 'left_only' if (compare_type == 'l') else 'right_only' if (compare_type == 'r') else 'both'
    

    My Working Code:

    import pandas as pd
    
    col_to_compare = '0';
    compare_type = 'r';     #l-left join / r - right join / b - inner join (both)
    
    # file1_df = pd.read_csv('filename1.csv', usecols=[col_to_compare], names=[col_to_compare])
    # file2_df = pd.read_csv('filename2.csv', usecols=[col_to_compare], names=[col_to_compare])
    file1_df = pd.DataFrame(
        {
            "0": ["K0", "K1", "K2", "K3"],
            "1": ["A0", "A1", "A2", "A3"],
            "2": ["B0", "B1", "B2", "B3"],
        }
    )
    
    file2_df = pd.DataFrame(
        {
            "0": ["K1", "K2", "K3", "K4"],
            "3": ["C0", "C1", "C2", "C3"],
            "4": ["D0", "D1", "D2", "D3"],
        }
    )
    
    
    file1_df[col_to_compare] = file1_df[col_to_compare].str.upper()
    file2_df[col_to_compare] = file2_df[col_to_compare].str.upper()
    
    comparison_result = pd.merge(file1_df, file2_df, on = col_to_compare, how = ('left' if (compare_type == 'l') else 'right' if (compare_type == 'r') else 'inner'), indicator = True)
    
    print(f'{comparison_result}\n')
    
    comparison_result = comparison_result.loc[comparison_result['_merge'] == ('left_only' if (compare_type == 'l') else ('right_only' if (compare_type == 'r') else 'both'))]
    
    print(f'{comparison_result}')
    # comparison_result.to_csv('result.csv')
    
    

    Output for compare_type = 'r':

    right_only

    Output for compare_type = 'l':

    left_only

    Output for compare_type = 'b':

    both

    Notes:

    I made some minor changes to debug the issue so you can avoid those.
    Avoid condition string rendering in DataFrame