Search code examples
postgresqltimestamptimestamp-with-timezone

In Postgres, how to generate a UTC timestamp with timezone column from a text column with YYYY-MM-DDTHH:MM:SSZ dates


My table has a "Timestamp" column (text type) with YYYY-MM-DDTHH:MM:SSZ formatted dates. I want to generate a timestamptz formatted column with a continuous UTC timestamp but have been unable to do it. I have tried many methods suggested in forums and documentation but I have not been able to get anything to work.

Here is a data example from the table:

select "Timestamp",("Timestamp"::timestamp with time zone) from public.time_177168 limit 1

This returns:

"2022-12-10T04:10:02-06:00" (Text) and "2022-12-10 10:10:02+00" (timestamp with time zone)

Here are a few examples of my attempts to generate the new column but they all return:

ERROR: generation expression is not immutable SQL state: 42P17

Attempt 1:

alter table public.time_177168 ADD COLUMN "TimestampUTC" timestamp with time zone GENERATED ALWAYS AS ("Timestamp"::timestamp with time zone) STORE

Attempt 2:

alter table public.time_177168 ADD COLUMN "TimestampUTC" timestamp with time zone GENERATED ALWAYS AS ("Timestamp"::timestamp AT TIME ZONE 'ETC/UTC') STORED

The overall goal is to be able to quickly order queries by UTC time. I am not able to change the data type for the existing "Timestamp" column because of legacy applications that use this database.

Any ideas or suggestion would be greatly appreciated.

Additional Information: Using the solution below I was able to get the query performance to an acceptable level.

Original Query:

EXPLAIN ANALYSE SELECT "Timestamp","Column1","Column2","Column3"  FROM time_177168 WHERE "Timestamp">'2022-11-06T00:59:00-06:00' ORDER BY ("Timestamp"::timestamp with time zone) limit 5000

Query Plan:

Limit  (cost=125360.32..125943.69 rows=5000 width=81) (actual time=5826.521..5828.301 rows=5000 loops=1)
  ->  Gather Merge  (cost=125360.32..198037.52 rows=622904 width=81) (actual time=5826.520..5827.743 rows=5000 loops=1)
        Workers Planned: 2
        Workers Launched: 0
        ->  Sort  (cost=124360.29..125138.92 rows=311452 width=81) (actual time=5826.186..5826.712 rows=5000 loops=1)
              Sort Key: ((Timestamp)::timestamp with time zone)
              Sort Method: top-N heapsort  Memory: 1089kB
              ->  Parallel Seq Scan on time_177168  (cost=0.00..103667.87 rows=311452 width=81) (actual time=0.136..5302.325 rows=747701 loops=1)
                    Filter: (Timestamp > '2022-11-06T00:59:00-06:00'::text)
                    Rows Removed by Filter: 438784
Planning Time: 0.145 ms
Execution Time: 5829.070 ms

New Query (Based on Accepted Solution)

EXPLAIN ANALYSE SELECT "Timestamp","Column1","Column2","Column3"  FROM time_177168 WHERE "Timestamp">'2022-11-06T00:59:00-06:00' ORDER BY "TimestampUTC" limit 5000

Query Plan:

Limit  (cost=0.43..2793.20 rows=5000 width=81) (actual time=728.625..748.371 rows=5000 loops=1)
  ->  Index Scan using timestamputc_time_177168 on time_177168 (cost=0.43..417511.91 rows=747486 width=81) (actual time=728.623..747.778 rows=5000 loops=1)
        Filter: (Timestamp > '2022-11-06T00:59:00-06:00'::text)
        Rows Removed by Filter: 438784
Planning Time: 0.134 ms
Execution Time: 756.844 ms

Solution

  • As long as you know the function is truly immutable, you can just declare it as such. So create a function like:

    CREATE FUNCTION str2timestamp(text) RETURNS timestamp with time zone
       IMMUTABLE SET timezone = 'UTC' LANGUAGE sql
    RETURN to_timestamp($1, 'YYYY-MM-DD\THH24:MI:SS);
    

    That is safe, because timezone is fixed while the function is running.

    Such a function can be used to define a generated column using the following steps:

    ALTER TABLE public.time_177168
       ADD "TimestampUTC" timestamp with time zone
          GENERATED ALWAYS AS (str2timestamp("Timestamp")) STORED;