Tags: python, pandas, dataframe, memory

Difference between df.items() and df.iteritems()


The pandas documentation for df.items() says:

Iterate over (column name, Series) pairs.

The exact same definition can be found for df.iteritems() as well. Both seem to be doing the same thing.

However, I was curious whether there is any difference between the two, as there is between dict.items() and dict.iteritems() according to this SO question. Apparently, in Python 2, dict.items() creates a real list of tuples, potentially taking a lot of memory, while dict.iteritems() returns a lazy iterator.

Is this the case with df.items() and df.iteritems()? Is df.iteritems() faster for dataframes having a large number of columns?


Solution

  • They are exactly the same; there is no difference. You can see the source code of both here. The iteritems method is essentially just this (apart from the type hints and doc decorator):

    def iteritems(self):
        yield from self.items()
    

    Note that iteritems is a holdover from Python 2. It was deprecated in pandas 1.5 (September 2022) and removed in pandas 2.0 (April 2023), so use items() instead.
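
To check this on your own installation, here is a minimal sketch. The DataFrame and the variable names are made up for illustration. It shows that df.items() already returns a lazy generator of (column name, Series) pairs, so there was never a memory advantage to iteritems, and it checks whether iteritems still exists in the installed pandas version.

```python
import types

import pandas as pd

# Small illustrative frame; the column names and values are arbitrary.
df = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})

pairs = df.items()

# items() is written as a generator function, so columns are produced
# one at a time rather than collected into a list up front.
is_lazy = isinstance(pairs, types.GeneratorType)

# Each element is a (column name, Series) pair, as the documentation says.
columns_seen = [(name, type(col).__name__) for name, col in pairs]

# iteritems is only present on pandas versions older than 2.0.
has_iteritems = hasattr(pd.DataFrame, "iteritems")

print(is_lazy, columns_seen, has_iteritems)
```

Code that has to run on both old and new pandas versions can simply call items() everywhere, since it is available in both.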