I am trying to loop over a dataframe in order to compare the value at row i with the value at row i+1, as follows:
import pandas as pd

d = {'col1': [1, 2, 0, 55, 12, 1, 3, 1, 56, 13],
     'col2': [3, 4, 44, 34, 46, 2, 3, 43, 35, 47],
     'col3': ['A', 'A', 'A', 'B', 'B', 'A', 'B', 'B', 'B', 'B']}
df = pd.DataFrame(data=d)
df
# Compare col3 in each row with col3 in the following row
for index, row in df.iterrows():
    if df.at[index, "col3"] != df.at[index+1, "col3"]:
        print('True')
    else:
        print("false")
I get the following output, ending in an error:
false
false
True
false
True
True
false
false
false
KeyError                                  Traceback (most recent call last)
in ()
      3
      4 for index, row in df.iterrows():
----> 5     if df.at[index,"col3"] != df.at[index+1,"col3"]:
      6         print('True')
      7     else:

in __getitem__(self, key)
   2140
   2141         key = self._convert_key(key)
-> 2142         return self.obj._get_value(*key, takeable=self._takeable)
   2143
   2144     def __setitem__(self, key, value):

   2538         try:
-> 2539             return engine.get_value(series._values, index)
   2540         except (TypeError, ValueError):
   2541

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 10
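Your code will always fail on the last row, because index+1 then refers to a row past the end of the data frame. Here the last label is 9, so the lookup asks for label 10, which does not exist (hence KeyError: 10).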
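To make the failure concrete, here is a minimal sketch that reproduces only the last lookup. It assumes the default integer index that pd.DataFrame creates; the variable names (last_label, missing_label, label_exists, raised) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col3": ["A", "A", "A", "B", "B", "A", "B", "B", "B", "B"]})

# With the default index, the labels run from 0 to len(df) - 1.
last_label = df.index[-1]
# On the final iteration the loop asks for the label one past the end.
missing_label = last_label + 1
label_exists = missing_label in df.index

# df.at raises KeyError when the requested label is not in the index.
try:
    df.at[missing_label, "col3"]
    raised = False
except KeyError:
    raised = True

print(last_label, missing_label, label_exists, raised)
```

Every earlier iteration works because index+1 is still a valid label; only the final one has nothing to compare against.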
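Generally, when you iterate over pairs of consecutive elements, so that the two sequences involved have different lengths, the zip
function is a convenient solution, because it stops as soon as the shorter sequence is exhausted: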
for this_row, next_row in zip(df["col3"], df["col3"][1:]):
    if this_row != next_row:
        print('True')
    else:
        print("false")
Note that this code does not throw an exception even if your data frame has only one row; in that case the loop body simply never runs.
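As a quick check of that edge case, the sketch below builds the pairs for a one-row frame and for the ten-row column from the question. The names single, pairs, full and changes are hypothetical, chosen only for this example.

```python
import pandas as pd

# One-row frame: the shifted column is empty, so zip yields no pairs.
single = pd.DataFrame({"col3": ["A"]})
pairs = list(zip(single["col3"], single["col3"][1:]))

# Ten-row column from the question: zip yields nine pairs, one per comparison.
full = pd.DataFrame({"col3": ["A", "A", "A", "B", "B", "A", "B", "B", "B", "B"]})
changes = [this_row != next_row
           for this_row, next_row in zip(full["col3"], full["col3"][1:])]

print(pairs)
print(len(changes))
```

The number of comparisons is always one less than the number of rows, which is why nothing is left over to raise an error.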
If you prefer to iterate using the index, an alternative option is:
for this_index, next_index in zip(df.index, df.index[1:]):
    if df.at[this_index, "col3"] != df.at[next_index, "col3"]:
        print('True')
    else:
        print("false")
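If the goal is only to know where col3 changes, the same comparison can probably be done without an explicit loop. The following is a sketch of one possible vectorized variant, not something the code above requires: it uses Series.shift(-1) to line each value up with the next one and drops the last row, which has no successor. The name changed is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"col3": ["A", "A", "A", "B", "B", "A", "B", "B", "B", "B"]})

# shift(-1) moves every value up one row, so row i is compared with row i+1.
# The last row is compared with a missing value, so it is dropped with iloc[:-1].
changed = df["col3"].ne(df["col3"].shift(-1)).iloc[:-1]

print(changed.tolist())
```

The result is a boolean Series with one entry per pair of consecutive rows, in the same order as the True/false lines printed by the loops.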