I am trying to run a simple PySpark program as a test.
Here is my code:
from pyspark.sql import SparkSession

if __name__ == "__main__":
    spark = SparkSession.builder \
        .appName("Welcome Spark") \
        .master("local[2]") \
        .getOrCreate()
    data_list = [("Aishwarya", 21), ("Jhanavi", 19), ("Maithree", 23)]
    df = spark.createDataFrame(data_list).toDF("Name", "Age")
    df.show()
I am trying to turn the list into a DataFrame, but I get an error while creating the DataFrame:
    data_list = [("Aishwarya", 21), ("Jhanavi", 19), ("Maithree", 23)]
    df = spark.createDataFrame(data_list).toDF("Name", "Age")
    df.show()
You can try the two methods below; both work for me:
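# Option 1: create the DataFrame first, then name the columns with toDF()
data_list = [("Aishwarya", 21), ("Jhanavi", 19), ("Maithree", 23)]
new_dfdf = spark.createDataFrame(data_list).toDF("Name", "Age")
new_dfdf.show(3)
# Option 2: pass the column names as the schema argument of createDataFrame()
op_dfdf = spark.createDataFrame(data_list, ("Name", "Age"))
op_dfdf.show(3)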