python python-3.x pandas dataframe pyspark

AttributeError: 'DataFrame' object has no attribute 'iteritems'


I am using pandas to read a CSV file on my machine, and then I create a PySpark DataFrame from the pandas DataFrame.

df = spark.createDataFrame(pandas_df) 

I updated pandas from version 1.3.0 to 2.0.

Now I am getting this error:

AttributeError: 'DataFrame' object has no attribute 'iteritems'

Solution

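  • Found an answer on GitHub: https://github.com/YosefLab/Compass/issues/92

    This is a known, ongoing issue.

    DataFrame.iteritems was removed in pandas 2.0, and older PySpark versions still call it when converting a pandas DataFrame in spark.createDataFrame.

    For now, I need to downgrade pandas to version 1.5.3 (the last 1.x release).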


    Edit:

    Other possible workarounds:

    Upgrade to the latest Spark (3.4.1 at the time of writing):

    https://spark.apache.org/downloads.html


    For pandas >= 2.0:

    You can also restore the removed name by aliasing DataFrame.iteritems to DataFrame.items before calling spark.createDataFrame:

    import pandas as pd
    # Re-add the removed method name as an alias of DataFrame.items
    pd.DataFrame.iteritems = pd.DataFrame.items

    https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.items.html?highlight=items#pandas.DataFrame.items
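
A slightly more defensive version of the alias workaround, as a minimal sketch: it only patches pandas when iteritems is actually missing, so the same script can run unchanged on pandas 1.x and 2.x. The helper name ensure_iteritems_alias and the sample data are made up for illustration and are not part of pandas or PySpark.

```python
import pandas as pd


def ensure_iteritems_alias() -> bool:
    """Add DataFrame.iteritems as an alias of DataFrame.items if it is missing.

    Hypothetical helper, not a pandas API.
    Returns True if the alias was added, False if iteritems already existed.
    """
    if hasattr(pd.DataFrame, "iteritems"):
        return False  # pandas < 2.0 (or already patched): nothing to do
    pd.DataFrame.iteritems = pd.DataFrame.items
    return True


ensure_iteritems_alias()

# Small illustrative frame (made-up data).
pandas_df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# iteritems now yields (column name, Series) pairs, the same as items().
columns = [name for name, _ in pandas_df.iteritems()]
print(columns)
```

Call the helper once, before spark.createDataFrame(pandas_df). It is a stopgap rather than a fix: upgrading Spark removes the need for the patch.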