
Python 3 pandas: filter condition built as a string is not recognized (KeyError)


I have a DataFrame df:

   Remarks  Unnamed: 13  Unnamed: 14                 Unnamed: 15  
0   ttttttt            3        3.333  =10000/(10000-(h2+i2)*100)   
1   ttttttt            3        3.300                               
2   ttttttt            3        3.333   

and a dict kwargs:

kwargs = {'Unnamed: 13': '3',
          'Unnamed: 15': ''}

Filtering works when I write the conditions out directly:

print(df[(df["Unnamed: 13"] == "3") & (df["Unnamed: 15"] == "")])

Result:

 Remarks  Unnamed: 13  Unnamed: 14 Unnamed: 15                       _id  \
1   ttttttt            3        3.300              5ae21c969268ff4118df7f8b   
2   ttttttt            3        3.333              5ae21c969268ff4118df7f8c   

Then I built the same condition as a string from kwargs:

find_key_and_val =str(' & '.join(["(df["+"\""+key+"\"" + "] == " + "\"" + val + "\")" for key, val in kwargs.items()]))

print(find_key_and_val)

(df["Unnamed: 13"] == "3") & (df["Unnamed: 15"] == "")

Then I used that string to index df:

print(df[find_key_and_val])

This will result in the following error:

Traceback (most recent call last):
  File "C:\Users\tlsdy\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\base.py", line 2525, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '(df["Unnamed: 13"] == "3") & (df["Unnamed: 15"] == "")'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/python_project_ab/amazon/property.py", line 201, in <module>
    print(df[find_key_and_val])
  File "C:\Users\tlsdy\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\tlsdy\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Users\tlsdy\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
    values = self._data.get(item)
  File "C:\Users\tlsdy\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals.py", line 3843, in get
    loc = self.items.get_loc(item)
  File "C:\Users\tlsdy\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '(df["Unnamed: 13"] == "3") & (df["Unnamed: 15"] == "")'

                 


What should I do?


Solution

  • You're asking pandas to filter the dataframe by a combination of boolean conditions, but what you're actually passing it is just a string of characters.

    This expression is Python code that builds boolean masks and uses them to select rows:

    print(df[(df["Unnamed: 13"] == "3") & (df["Unnamed: 15"] == "")])
    

    While this expression only produces a string:

    str(' & '.join(["(df["+"\""+key+"\"" + "] == " + "\"" + val + "\")" for key, val in kwargs.items()]))
    

    Your dataframe is literally looking for a column labeled '(df["Unnamed: 13"] == "3") & (df["Unnamed: 15"] == "")', which doesn't exist, hence the KeyError. The interpreter is not going to come across this string mid-program and think "Oh, this is meant to be code that should be executed". It just sees a string like any other.

    If you want to evaluate a string as code, you can use the built-in eval() function. For example:

    import pandas as pd
    
    data = [
        ['ttttttt', 3, 3.333, 10.0],
        ['ttttttt', 3, 3.300, ""],
        ['ttttttt', 3, 3.333, ""],
    ]
    
    df = pd.DataFrame(data, columns=['Remarks', 'Unnamed: 13', 'Unnamed: 14', 'Unnamed: 15'])
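    # The string holds the complete expression, including the outer df[...].
    string_query = """df[(df['Unnamed: 13'] == 3) & (df["Unnamed: 15"] == "")]"""
    
    # eval() evaluates the string as a Python expression and returns the filtered dataframe.
    print(eval(string_query))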
    

    output:

       Remarks  Unnamed: 13  Unnamed: 14 Unnamed: 15
    1  ttttttt            3        3.300            
    2  ttttttt            3        3.333
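
As an alternative to eval(), the filter can be built from kwargs as real boolean masks, so no string is ever executed. This is only a minimal sketch: the helper name filter_by_values is made up for illustration, and the sample data assumes the columns hold strings, as the comparisons in the question suggest.

```python
from functools import reduce
import operator

import pandas as pd

# Sample data modeled on the question (assumption: values are stored as strings).
df = pd.DataFrame(
    {
        "Remarks": ["ttttttt", "ttttttt", "ttttttt"],
        "Unnamed: 13": ["3", "3", "3"],
        "Unnamed: 14": [3.333, 3.300, 3.333],
        "Unnamed: 15": ["=10000/(10000-(h2+i2)*100)", "", ""],
    }
)

kwargs = {"Unnamed: 13": "3", "Unnamed: 15": ""}


def filter_by_values(frame, conditions):
    """Return the rows where every column in `conditions` equals its value.

    Hypothetical helper, not part of pandas.
    """
    # One boolean Series per (column, value) pair.
    masks = [frame[column] == value for column, value in conditions.items()]
    # AND all masks together; starting from an all-True mask means an
    # empty `conditions` dict keeps every row.
    all_true = pd.Series(True, index=frame.index)
    combined = reduce(operator.and_, masks, all_true)
    return frame[combined]


result = filter_by_values(df, kwargs)
print(result)
```

Because the column names and values stay ordinary Python objects, quotes or other special characters in them need no escaping, which the string-building approach would have to handle by hand.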