I'd like to create a new timestamp column on a DataFrame from a date column and a string column:
Date | Times (String) | desired column |
---|---|---|
2020-11-03 | 15:34:02 | 2020-11-03 15:34:02 |
I'm trying something like this in the select statement, but I'm getting an error. Can anyone help?
F.to_timestamp(F.concat_ws('', F.col("Date"), F.col("Time"), 'yyyy-MM-dd HH:mm:ss')).alias("desired_column")
You can do this with PySpark functions:
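from pyspark.sql import SparkSession
from pyspark.sql import functions as sf
# SparkSession is the current entry point and replaces SparkContext + SQLContext
spark = SparkSession.builder.getOrCreate()
# Sample data frame, used only to demonstrate the approach
df = spark.createDataFrame([('2020-11-03', '15:34:02')], ['Date', 'Times (String)'])
# show() prints the table itself and returns None, so no print() is needed
df.show()
# Concatenate the date and time strings with a single space between them
df = df.withColumn('desired column', sf.concat(sf.col('Date'), sf.lit(' '), sf.col('Times (String)')))
df.show()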