In PySpark I have a dataset as below:
|weekend_day|totals |
| 2023-02-25| 401943676|
| 2023-03-11| 410220150|
and the expected output is
| | 2023-02-25 | 2023-03-11 |
| totals | 401943676 | 410220150 |
pivot is not giving me this result. Please advise how it can be achieved.
Please note I don't want to use Pandas
Thank you
I'm not sure what you mean by "pivot is not providing the result".
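from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()  # reuses the existing session if there is one
df = spark.createDataFrame(
    [('2023-02-25', 401943676), ('2023-03-11', 410220150)],
    schema=['weekend_day', 'totals']
)
df.printSchema()
df.show(10, False)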
|weekend_day|totals |
|2023-02-25 |401943676|
|2023-03-11 |410220150|
You can use groupBy and pivot to achieve the expected output:
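from pyspark.sql import functions as func
# Assumed reconstruction: group all rows under one constant label, then pivot the dates into columns.
pivoted = df.groupBy(func.lit('total')).pivot('weekend_day').agg(func.first('totals'))
pivoted.show(10, False)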
|total |401943676 |410220150 |
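To make it clearer why grouping on a constant and pivoting on weekend_day yields a single row, here is a rough plain-Python sketch of the reshaping. It only illustrates what the result looks like and is not how Spark executes the query; the helper name pivot_single_row and the label 'total' are made up for this example.

```python
# Plain-Python illustration only (no Spark, no Pandas); pivot_single_row is a
# hypothetical helper, not a PySpark function.
rows = [
    {"weekend_day": "2023-02-25", "totals": 401943676},
    {"weekend_day": "2023-03-11", "totals": 410220150},
]


def pivot_single_row(rows, pivot_col, value_col, label):
    """Turn each distinct pivot_col value into a column of one output row."""
    result = {"": label}  # the unnamed first column holds the row label
    for row in sorted(rows, key=lambda r: r[pivot_col]):
        # Keep the first value seen per pivot key, as a first()-style aggregation would.
        result.setdefault(row[pivot_col], row[value_col])
    return result


pivoted = pivot_single_row(rows, "weekend_day", "totals", "total")
print(pivoted)
```

Because every input row falls into the same group, there is exactly one output row, and each distinct weekend_day value becomes its own column.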