Tags: apache-spark-sql, google-bigquery, unix-timestamp

SparkSQL equivalent in BigQuery unix_timestamp


I am trying to understand the following line and write its equivalent in BigQuery SQL:

from_unixtime(unix_timestamp('2019-01-15', 'yyyy-MM-dd') - 900)

What exactly is this line in SparkSQL doing and what would be the equivalent of this in BigQuery syntax?


Solution

  • From what I can tell, the expression does the following:

    1. Converts the string 2019-01-15 to a Unix timestamp (seconds since 1970-01-01 UTC)
    2. Subtracts 900 seconds (15 minutes) from that Unix timestamp
    3. Converts the resulting Unix time (INT64) back to a TIMESTAMP
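    The three steps can be sketched in plain Python using only the standard library (a minimal illustration of the arithmetic, not BigQuery or Spark code):

    ```python
    from datetime import datetime, timezone

    # Step 1: parse '2019-01-15' and take its Unix time
    # (seconds since 1970-01-01 UTC)
    ts = datetime.strptime('2019-01-15', '%Y-%m-%d').replace(tzinfo=timezone.utc)
    unix_seconds = int(ts.timestamp())

    # Step 2: subtract 900 seconds (15 minutes)
    shifted = unix_seconds - 900

    # Step 3: convert the shifted Unix time back to a timestamp
    result = datetime.fromtimestamp(shifted, tz=timezone.utc)
    print(result)  # 2019-01-14 23:45:00+00:00
    ```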

    Basically, it subtracts 15 minutes from 2019-01-15 00:00:00 UTC. There is a more concise way of doing this in BigQuery; here is a query that shows both expressions together:

    SELECT
        PARSE_TIMESTAMP("%Y-%m-%d", '2019-01-15') AS original_timestamp,
        -- The equivalent of the Spark expression you have
        TIMESTAMP_SECONDS(UNIX_SECONDS(PARSE_TIMESTAMP("%Y-%m-%d", '2019-01-15')) - 900) AS original_expression,
        -- Use this instead; it does the same thing
        TIMESTAMP_ADD(PARSE_TIMESTAMP("%Y-%m-%d", '2019-01-15'), INTERVAL -15 MINUTE) AS more_concise

    Results: original_timestamp is 2019-01-15 00:00:00 UTC, while original_expression and more_concise both return 2019-01-14 23:45:00 UTC.
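    Note that `TIMESTAMP_ADD` with a negative interval is just subtraction; BigQuery's `TIMESTAMP_SUB` with a positive interval gives the same result. The equivalence is easy to check in Python with `timedelta` (again only an illustration of the arithmetic):

    ```python
    from datetime import datetime, timedelta, timezone

    ts = datetime(2019, 1, 15, tzinfo=timezone.utc)

    # Analogue of TIMESTAMP_ADD(ts, INTERVAL -15 MINUTE)
    via_add = ts + timedelta(minutes=-15)

    # Analogue of TIMESTAMP_SUB(ts, INTERVAL 15 MINUTE)
    via_sub = ts - timedelta(minutes=15)

    print(via_add == via_sub)  # True
    ```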