I am learning Python (Pandas) and working with the Pokemon DB (following Youtube tutorial) I am currently learning .loc and query methods so that I can access data in different ways. I encounter an error while working with df.query.
Please find my information below:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv(r"C:\XXXX\pokemon.csv")
When I try to access the Pokemons that are type 1 = fire and type 2= flying using .loc it works and I have no errors as shown below:
df.loc[(df['Type 1'] == "Fire") & (df['Type 2'] == "Flying")]
But, when I try to do the same using .query I get an error
df.query("'Type 1' == 'Fire' and 'Type 2' == 'Flying'")
The error is the following:
KeyError Traceback (most recent call last)
Cell In[30], line 3
1 #if we want to bring their rows to see their names. We need to be aware that it is 6 Pokemons
2 #df.loc[(df['Type 1'] == "Fire") & (df['Type 2'] == "Flying")]
----> 3 df.query("'Type 1' == 'Fire' and 'Type 2' == 'Flying'")
File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\frame.py:4605, in DataFrame.query(self, expr, inplace, **kwargs)
4602 res = self.eval(expr, **kwargs)
4604 try:
-> 4605 result = self.loc[res]
4606 except ValueError:
4607 # when res is multi-dimensional loc raises, but this is sometimes a
4608 # valid query
4609 result = self[res]
File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\indexing.py:1153, in _LocationIndexer.__getitem__(self, key)
1150 axis = self.axis or 0
1152 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1153 return self._getitem_axis(maybe_callable, axis=axis)
File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\indexing.py:1393, in _LocIndexer._getitem_axis(self, key, axis)
1391 # fall thru to straight lookup
1392 self._validate_key(key, axis)
-> 1393 return self._get_label(key, axis=axis)
File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\indexing.py:1343, in _LocIndexer._get_label(self, label, axis)
1341 def _get_label(self, label, axis: AxisInt):
1342 # GH#5567 this will fail if the label is not present in the axis.
-> 1343 return self.obj.xs(label, axis=axis)
File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\generic.py:4236, in NDFrame.xs(self, key, axis, level, drop_level)
4234 new_index = index[loc]
4235 else:
-> 4236 loc = index.get_loc(key)
4238 if isinstance(loc, np.ndarray):
4239 if loc.dtype == np.bool_:
File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\indexes\range.py:418, in RangeIndex.get_loc(self, key)
416 raise KeyError(key) from err
417 if isinstance(key, Hashable):
--> 418 raise KeyError(key)
419 self._check_indexing_error(key)
420 raise KeyError(key)
KeyError: False
I searched for the error, and I understand that KeyError happens when I am calling the wrong key or unexisting key. Why I am getting this issue when trying to retrieve Keys that I already use before?
**What I have tried until now? **
df.query("Speed <= 10")
I tried different syntaxes for my query thinking that maybe the "" and '' were wrong, but it did not work. I checked the documentation for that.
I checked if maybe the names are wrong and verify the names of my columns with:
print(df.columns.tolist())
I restarted my kernels and loaded my file again, thinking I moved or did something to change some settings.
Updated my libraries in Anaconda just in case is something related to that
I haven't tried the exceptions yet, as I am very curious why the video I am following has no issues
Please let me know if you have any ideas,
Thank you in advance
You have to use backticks because Type 1
and Type 2
are not valid Python variable names:
>>> df.query("`Type 1` == 'Fire' and `Type 2` == 'Flying'")
Name Type 1 Type 2 Total HP Attack Defense Sp. Atk Sp. Def Speed Generation Legendary
#
6 Charizard Fire Flying 534 78 84 78 109 85 100 1 False
6 CharizardMega Charizard Y Fire Flying 634 78 104 78 159 115 100 1 False
146 Moltres Fire Flying 580 90 100 90 125 85 90 1 True
250 Ho-oh Fire Flying 680 106 130 90 110 154 90 2 True
662 Fletchinder Fire Flying 382 62 73 55 56 52 84 6 False
663 Talonflame Fire Flying 499 78 81 71 74 69 126 6 False
From documentation:
You can refer to column names that are not valid Python variable names by surrounding them in backticks.