Convert pyspark.sql.dataframe.DataFrame type Dataframe to Dictionary

I have a PySpark DataFrame and I need to convert it into a Python dictionary.

The code below reproduces the DataFrame:

from pyspark.sql import Row
# sc is the SparkContext, predefined in the pyspark shell
rdd = sc.parallelize([Row(name='Alice', age=5, height=80), Row(name='Alice', age=5, height=80), Row(name='Alice', age=10, height=80)])
df = rdd.toDF()

Once I have this DataFrame, I need to convert it into a dictionary.

I tried it like this:


But it raises an error. How can I achieve this?


  • You need to first convert to a pandas.DataFrame using toPandas(), then you can use the to_dict() method on the transposed dataframe with orient='list':

    # Out[1]: {u'Alice': [10, 80]}
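
    The pandas half of this approach can be sketched without a Spark session. This is a minimal sketch, not necessarily the exact code from the answer: `pdf` is a hand-built stand-in for what `df.toPandas()` would return for the question's data, and using `name` as the dictionary key is an assumption inferred from the output shown above. Because every row has the same name, the sketch keeps only the last row per name so that the transposed frame has unique columns.

    ```python
    import pandas as pd

    # Stand-in for df.toPandas(): the same three rows as the question's DataFrame.
    # With a live Spark session you would write: pdf = df.toPandas()
    pdf = pd.DataFrame(
        {
            "name": ["Alice", "Alice", "Alice"],
            "age": [5, 5, 10],
            "height": [80, 80, 80],
        }
    )

    # Keep one row per name (the last one) so the transposed frame
    # has unique column labels.
    unique_rows = pdf.drop_duplicates(subset="name", keep="last")

    # Index by name, transpose so each name becomes a column,
    # then map each column to a list of its values: [age, height].
    result = unique_rows.set_index("name").T.to_dict("list")
    print(result)
    ```

    Note that toPandas() collects the whole DataFrame onto the driver, so this only suits data that fits in the driver's memory.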