Let's say I have a column like the one below:
Date |
---|
03/2024 |
07/2024 |
12/2024 |
06/2024 |
01/2024 |
but I want to reorder the string and remove the character in the middle (the slash), like this:
Date |
---|
202403 |
202407 |
202412 |
202406 |
202401 |
Please help me!
If the column is not a proper date type, you can treat it as a string:
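import pyspark.sql.functions as f
from pyspark.sql.types import StringType

# Assumes an active SparkSession named `spark`.
# A list of plain strings with a StringType schema yields a single column named "value".
df = spark.createDataFrame(
    ["03/2024", "07/2024", "12/2024", "06/2024", "01/2024"],
    StringType())

# Split "MM/YYYY" on the slash: Split[0] is the month, Split[1] is the year.
df = df.withColumn("Split", f.split(df.value, "/"))

# Concatenate year then month to get "YYYYMM".
df.withColumn("Ordered", f.concat(df.Split[1], df.Split[0])).show()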
+-------+----------+-------+
| value| Split|Ordered|
+-------+----------+-------+
|03/2024|[03, 2024]| 202403|
|07/2024|[07, 2024]| 202407|
|12/2024|[12, 2024]| 202412|
|06/2024|[06, 2024]| 202406|
|01/2024|[01, 2024]| 202401|
+-------+----------+-------+
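
To sanity-check the expected values without a cluster, the same reordering can be sketched in plain Python with a regular expression. The helper name `reorder_month_year` and the strict `MM/YYYY` pattern are assumptions made for illustration, not part of the question; values that do not match the pattern are returned unchanged.

```python
import re

# Matches "MM/YYYY": a two-digit month, a slash, then a four-digit year.
_MONTH_YEAR = re.compile(r"^(\d{2})/(\d{4})$")


def reorder_month_year(value: str) -> str:
    """Turn 'MM/YYYY' into 'YYYYMM'; leave non-matching input unchanged."""
    # \2 is the year group and \1 is the month group, so the slash is dropped.
    return _MONTH_YEAR.sub(r"\2\1", value)


dates = ["03/2024", "07/2024", "12/2024", "06/2024", "01/2024"]
reordered = [reorder_month_year(d) for d in dates]
print(reordered)  # ['202403', '202407', '202412', '202406', '202401']
```

In Spark, the same pattern should also work in a single step with `f.regexp_replace(df.value, r"^(\d{2})/(\d{4})$", "$2$1")`, which avoids the intermediate Split column. Note that Spark uses Java regex syntax, so group references in the replacement are written `$2$1` rather than `\2\1`.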