Tags: json, apache-spark-sql, timestamp, spark-streaming, unix-timestamp

from_unixtime function not giving correct output in spark-sql


Note - I have searched the internet; all the solutions I found are in either Python or Scala, not Java.

I am reading data from a JSON file in which the timestamps are given in Unix-time format, and I am converting them to a date-time format.

The JSON file data looks like this -

"""[{"Arrival_Time":1424686735175,"Creation_Time":1424686733176178965,"Device":"nexus4_1","Index":35,"Model":"nexus4","User":"g","gt":"stand","x":0.0014038086,"y":5.0354E-4,"z":-0.0124053955}
{....}
{....}
]"""

My code is -

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_unixtime;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.StructType;

StructType activitySchema = new StructType().add("Arrival_Time", "BIGINT")
                .add("Creation_Time", "BIGINT")
                .add("Device","string")
                .add("Index", "string")
                .add("Model","string")
                .add("User", "string")
                .add("gt","string")
                .add("x", "DOUBLE")
                .add("y","DOUBLE")
                .add("z","DOUBLE");

        Dataset<Row> jsondf = spark
                .readStream()
                .schema(activitySchema)
                .json("file:///Users/anuragharsh/Desktop/Data/Activity_Data/")
                .select(
                        from_unixtime(col("Arrival_Time"),"MM-dd-yyyy HH:mm:ss").as("timestamp_1"),
                        from_unixtime(col("Creation_Time"),"MM-dd-yyyy HH:mm:ss").as("timestamp_2")
                );

Still, the data is getting printed in this format -

[Output screenshot]

How do I convert it to a date-time format?


Solution

  • I got the solution: from_unixtime expects its input as seconds since the Unix epoch, so Arrival_Time (given in milliseconds) and Creation_Time (given in nanoseconds) have to be scaled down to seconds first. After converting them it worked like a charm :)
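
    For reference, here is a minimal sketch of the corrected select in Java. It reuses the schema, path, and column names from the question; the only change is dividing Arrival_Time (milliseconds) and Creation_Time (nanoseconds) down to seconds before passing them to from_unixtime:

    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.from_unixtime;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    // from_unixtime expects seconds since the epoch, so scale each column accordingly
    Dataset<Row> jsondf = spark
            .readStream()
            .schema(activitySchema)
            .json("file:///Users/anuragharsh/Desktop/Data/Activity_Data/")
            .select(
                    // Arrival_Time is in milliseconds -> divide by 1,000
                    from_unixtime(col("Arrival_Time").divide(1000).cast("long"),
                            "MM-dd-yyyy HH:mm:ss").as("timestamp_1"),
                    // Creation_Time is in nanoseconds -> divide by 1,000,000,000
                    from_unixtime(col("Creation_Time").divide(1000000000L).cast("long"),
                            "MM-dd-yyyy HH:mm:ss").as("timestamp_2")
            );

    The explicit cast to long is just to avoid feeding from_unixtime the fractional value that divide produces; Spark would otherwise truncate it via an implicit cast.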