python pandas indexing keyerror

Trying to get the i+1 index on a pandas DataFrame fails with KeyError


I am trying to loop over a DataFrame in order to compare the values at index i and index i+1, as follows:

import pandas as pd

d = {'col1': [1, 2, 0, 55, 12, 1, 3, 1, 56, 13], 'col2': [3, 4, 44, 34, 46, 2, 3, 43, 35, 47], 'col3': ['A', 'A', 'A', 'B', 'B', 'A', 'B', 'B', 'B', 'B']}
df = pd.DataFrame(data=d)
df

for index, row in df.iterrows():
    if df.at[index,"col3"] != df.at[index+1,"col3"]:
        print('True')
    else:
        print("false")

I get the following output, and then this error:

false
false
True
false
True
True
false
false
false

KeyError                                  Traceback (most recent call last)
 in ()
      3
      4 for index, row in df.iterrows():
----> 5     if df.at[index,"col3"] != df.at[index+1,"col3"]:
      6         print('True')
      7     else:

 in __getitem__(self, key)
   2140
   2141         key = self._convert_key(key)
-> 2142         return self.obj._get_value(*key, takeable=self._takeable)
   2143
   2144     def __setitem__(self, key, value):

   2538         try:
-> 2539             return engine.get_value(series._values, index)
   2540         except (TypeError, ValueError):
   2541 

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 10

Solution

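  • Your code will always fail on the last row, because it tries to read the row after the end. The DataFrame has index labels 0 to 9, so when index is 9, df.at[index+1,"col3"] looks up label 10, which does not exist. That is the KeyError: 10 in your traceback.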

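    Generally, when you iterate over two sequences of different lengths in parallel, as here with the column and the same column shifted by one, the zip function is a good fit, because it stops as soon as the shorter sequence is exhausted: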

    for this_row, next_row in zip(df["col3"], df["col3"][1:]):
        if this_row != next_row:
            print('True')
        else:
            print("false")
    

    Note that this code does not throw an exception even if your data frame has only one row: the sliced column is then empty, so the loop body never runs.

    If you prefer to iterate over the index, an alternative is:

    for this_index, next_index in zip(df.index, df.index[1:]):
        if df.at[this_index,"col3"] != df.at[next_index,"col3"]:
            print('True')
        else:
            print("false")
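
As a self-contained check of the points above, here is a minimal sketch that wraps the zip approach in a function and runs it on the question's data and on a one-row frame. The names adjacent_changes and adjacent_changes_vectorized are hypothetical helpers introduced only for illustration. The second helper, based on Series.shift, is not part of the answer above; it is shown as one possible loop-free alternative.

```python
import pandas as pd

d = {
    'col1': [1, 2, 0, 55, 12, 1, 3, 1, 56, 13],
    'col2': [3, 4, 44, 34, 46, 2, 3, 43, 35, 47],
    'col3': ['A', 'A', 'A', 'B', 'B', 'A', 'B', 'B', 'B', 'B'],
}
df = pd.DataFrame(data=d)


def adjacent_changes(frame, column):
    """Return one bool per pair of neighbouring rows: True if the value changes."""
    values = frame[column]
    # zip stops at the shorter input, so the last row is never paired
    # with a non-existent "next" row.
    return [this_value != next_value
            for this_value, next_value in zip(values, values.iloc[1:])]


def adjacent_changes_vectorized(frame, column):
    """Same result without a Python-level loop (assumption: not from the answer)."""
    values = frame[column]
    # shift(-1) moves every value up one row and leaves NaN in the last row;
    # that last comparison is meaningless, so it is dropped with iloc[:-1].
    return (values != values.shift(-1)).iloc[:-1].tolist()


print(adjacent_changes(df, "col3"))
# [False, False, True, False, True, True, False, False, False]

print(adjacent_changes(df.iloc[:1], "col3"))
# [] (a one-row frame has no pairs, and no exception is raised)
```

The nine results match the nine lines printed in the question before the KeyError; the tenth comparison, which caused the error, is simply never attempted.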