I don't know why this Spark SQL query drops the three trailing zeros after the decimal point.
df = spark.sql("""
select cast('1900-01-01 00:00:00.000' as timestamp) as a from dbo.products
""")
df.show(1, False)
The query above produces 1900-01-01 00:00:00 and somehow drops the three trailing zeros after the decimal point. The expected output is
1900-01-01 00:00:00.000
Can someone help, please?
It seems that when the milliseconds are .000, Spark skips that part by default:
>>> spark.sql("SELECT CAST('1900-01-01 00:00:00.000' AS TIMESTAMP)").show(truncate=False)
+------------------------------------------+
|CAST(1900-01-01 00:00:00.000 AS TIMESTAMP)|
+------------------------------------------+
|1900-01-01 00:00:00                       |
+------------------------------------------+
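This only affects the rendering; the stored timestamp still carries the full value. As a quick sanity check (a minimal sketch; the same alias is just for readability), the two literals compare equal:
>>> spark.sql("SELECT CAST('1900-01-01 00:00:00.000' AS TIMESTAMP) = CAST('1900-01-01 00:00:00' AS TIMESTAMP) AS same").show()
+----+
|same|
+----+
|true|
+----+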
If you set the milliseconds to, for example, .999, they will be displayed:
>>> spark.sql("SELECT CAST('1900-01-01 00:00:00.999' AS TIMESTAMP)").show(truncate=False)
+------------------------------------------+
|CAST(1900-01-01 00:00:00.999 AS TIMESTAMP)|
+------------------------------------------+
|1900-01-01 00:00:00.999                   |
+------------------------------------------+
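Note that the trimming applies to any trailing zeros in the fractional part, not only to an all-zero fraction; for example, .990 renders as .99 (a sketch following the same pattern, as observed on Spark 3.x):
>>> spark.sql("SELECT CAST('1900-01-01 00:00:00.990' AS TIMESTAMP)").show(truncate=False)
+------------------------------------------+
|CAST(1900-01-01 00:00:00.990 AS TIMESTAMP)|
+------------------------------------------+
|1900-01-01 00:00:00.99                    |
+------------------------------------------+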
If you really need that part to be displayed, use date_format:
>>> from pyspark.sql import functions as F
>>> df = spark.sql("SELECT CAST('1900-01-01 00:00:00.000' AS TIMESTAMP) datecol")
>>> df.select(F.date_format("datecol", "yyyy-MM-dd HH:mm:ss.SSS").alias("datecol")).show(truncate=False)
+-----------------------+
|datecol                |
+-----------------------+
|1900-01-01 00:00:00.000|
+-----------------------+
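Keep in mind that date_format returns a string column, so it is suited to display and export rather than further timestamp arithmetic. It also works directly inside SQL, which fits the shape of your original query (a sketch, with the dbo.products table dropped so it runs standalone):
>>> spark.sql("""
...     SELECT date_format(CAST('1900-01-01 00:00:00.000' AS TIMESTAMP),
...                        'yyyy-MM-dd HH:mm:ss.SSS') AS datecol
... """).show(truncate=False)
+-----------------------+
|datecol                |
+-----------------------+
|1900-01-01 00:00:00.000|
+-----------------------+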