This question is similar to: Change the timestamp to UTC format in Pyspark
Basically, I need to convert an ISO 8601 timestamp string with an offset into a UTC timestamp string (2017-08-01T14:30:00+05:30 -> 2017-08-01T09:00:00+00:00) using Scala.
I am fairly new to Scala/Java. I checked the Spark library and could not find a way to do the conversion without knowing the time zone, and I do not know the time zone unless I parse it out of the string in some ugly way. Is there a Java/Scala library that handles this? Can someone help?
UPDATE: A better way to do this: set the session time zone in Spark, then cast the column with cast(DataTypes.TimestampType) to do the time zone shift.
You can use the java.time classes to parse and convert your timestamp.
scala> import java.time.{OffsetDateTime, ZoneOffset}
import java.time.{OffsetDateTime, ZoneOffset}
scala> val datetime = "2017-08-01T14:30:00+05:30"
datetime: String = 2017-08-01T14:30:00+05:30
scala> OffsetDateTime.parse(datetime).withOffsetSameInstant(ZoneOffset.UTC)
res44: java.time.OffsetDateTime = 2017-08-01T09:00Z
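Note that the REPL result above prints as 2017-08-01T09:00Z, not in the 2017-08-01T09:00:00+00:00 form the question asks for. A minimal sketch of one way to get that exact string, assuming whole-second precision is enough (fractional seconds would be dropped by this pattern); the names UtcConverter and toUtcString are made up for illustration:

```scala
import java.time.{OffsetDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter

object UtcConverter {
  // Lowercase "xxx" prints a zero offset as +00:00;
  // uppercase "XXX" would print it as Z instead.
  private val utcFormat: DateTimeFormatter =
    DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssxxx")

  // Parse an ISO 8601 string that carries an offset, shift it to the
  // same instant at UTC, and render it with an explicit +00:00 offset.
  def toUtcString(iso: String): String =
    OffsetDateTime
      .parse(iso)
      .withOffsetSameInstant(ZoneOffset.UTC)
      .format(utcFormat)
}
```

For example, UtcConverter.toUtcString("2017-08-01T14:30:00+05:30") gives 2017-08-01T09:00:00+00:00. OffsetDateTime.parse throws a DateTimeParseException on input without an offset, so malformed rows would need their own handling if this were wrapped in a Spark UDF.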