pandas pyspark ipython jupyter-notebook apache-spark-sql

PySpark: show a DataFrame as a table with horizontal scroll in an IPython notebook


A pyspark.sql.DataFrame displays messily with DataFrame.show(): wide rows wrap onto the next line instead of scrolling horizontally.

The same data displays cleanly as a table with pandas.DataFrame.head().

I tried these options:

    import IPython
    IPython.auto_scroll_threshold = 9999

    from IPython.core.interactiveshell import InteractiveShell
    InteractiveShell.ast_node_interactivity = "all"
    from IPython.display import display

None of these helped. Horizontal scrolling does work, however, when the notebook is run in the Atom editor with the Jupyter plugin.


Solution

  • This is a workaround:

    spark_df.limit(5).toPandas().head()
    

    I do not know the computational cost of this query, though. I believe limit() is not expensive; corrections welcome.
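
On the cost question, here is a minimal sketch of the same workaround wrapped in a helper. The name preview and its parameter n are hypothetical and not part of PySpark; only limit() and toPandas() are real DataFrame methods. As far as I know, limit() is a lazy transformation, so no job runs until toPandas() collects the rows to the driver, and at most n rows are transferred. How much upstream work Spark does to produce those rows depends on the query plan, so treat this as cheap for simple reads rather than free in general.

```python
def preview(spark_df, n=5):
    """Return at most n rows of a Spark DataFrame as a pandas DataFrame.

    `preview` is a hypothetical helper name, not a PySpark API.
    """
    # limit() only adds a step to the query plan; nothing is computed yet.
    limited = spark_df.limit(n)
    # toPandas() is the action: it runs the job and collects the limited
    # rows into driver memory, so keep n small.
    return limited.toPandas()


# Assumed usage in a notebook cell, where spark_df is an existing
# pyspark.sql.DataFrame; as the last expression of the cell, the result
# is rendered by the notebook as a pandas table:
# preview(spark_df, 10)
```

Because the result is an ordinary pandas DataFrame, the notebook displays it the same way as the pandas.DataFrame.head() output mentioned in the question.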