python python-3.x pyspark apache-spark-sql

PySpark: Why does a DataFrame read from CSV show special characters when displayed? How can I show it in tabular form without pandas?


I am reading a CSV file with PySpark. When I display the resulting DataFrame in a Jupyter notebook, the header appears to contain special characters. How can I display the data without these characters? The data is also misaligned, as you can see in the picture. How can I display it in tabular form without using pandas?

py_df = spark.read.option('header', 'true').csv(r"E:\Data files\Amazon e-commerce data.csv")

(screenshot of the notebook output)


Solution

  • Try passing truncate=False to show():

    py_df = spark.read.option('header', 'true').csv(r"E:\Data files\Amazon e-commerce data.csv")
    py_df.show(truncate=False)  # show() returns None, so call it on its own line
    

    By default show() prints only the first 20 rows. To see more, pass n, for example n=1000 for up to 1000 rows.
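
If the table still wraps in the notebook because it has many or very wide columns, the `vertical` argument of `show()` may help: it prints each row as a list of `column | value` lines instead of one long table row. Below is a minimal sketch combining the three arguments. The helper name `preview`, the `demo` function, and the sample row are made up for illustration and are not from the question; only `show(n=..., truncate=..., vertical=...)` is PySpark's own API.

```python
def preview(df, rows=20, vertical=False):
    """Print up to `rows` rows of a PySpark DataFrame without truncating cells.

    `preview` is a hypothetical helper name; it only wraps DataFrame.show().
    """
    # truncate=False keeps full cell contents; vertical=True prints one
    # "column | value" line per field, which avoids wrapped rows on wide tables.
    df.show(n=rows, truncate=False, vertical=vertical)


def demo():
    """Small local demo; requires a working PySpark installation."""
    # Imported here so the helper above can be defined without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("show-demo").getOrCreate()
    # Made-up sample row standing in for the CSV from the question.
    df = spark.createDataFrame(
        [("B001", "A fairly long product title that would normally be cut off", 19.99)],
        ["product_id", "title", "price"],
    )
    preview(df)                 # regular table layout, full cell contents
    preview(df, vertical=True)  # one field per line
    spark.stop()
```

With the DataFrame from the question this would be called as `preview(py_df, rows=1000)` or `preview(py_df, vertical=True)`. Whether this removes the characters seen in the screenshot depends on what they are; if they are the `+---+` borders of a wrapped table, the vertical layout should avoid them.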