Search code examples
postgresqldatetimetimestampquery-performance

postgres NOW() function taking too long vs string equivalent


this is my first question on StackOverflow so forgive if the question may not be properly structured.

I have a table t_table with datetime column d_datetime and I need to filter data between the past 5 days. Both of the following work locally where I have less data:

query 1.

SELECT * FROM t_table 
WHERE d_datetime 
BETWEEN '2020-08-28T00:00:00.024Z' AND '2020-09-02T00:00:00.024Z';

query 2.

SELECT * FROM t_table 
WHERE d_datetime 
BETWEEN (NOW() - INTERVAL '5 days') AND NOW();

query 3.

SELECT * FROM t_table 
WHERE d_datetime > NOW() - INTERVAL '5 days'; 

However, when I move to the live database, only the first query runs to completion in about 10 seconds. I cannot tell why but the other two just stick consuming too much processing power and I haven't once seen them run to completion, even after waiting upto 5 minutes on end.

I have tried automatically generating the strings used for the d_datetime shown in the first query using:

query 4.

SELECT * FROM t_table 
WHERE d_datetime 
BETWEEN 
(TO_CHAR(NOW() - INTERVAL '5 days', 'YYYY-MM-ddThh:MI:SS.024Z'))
 AND
(TO_CHAR(NOW(), 'YYYY-MM-ddThh:MI:SS.024Z'))

but it throws the following error:

operator does not exist: timestamp without time zone >= text

My questions are:

  1. Is there any particular reason why query 1 so fast and the rest take an extremely longer period of time to run on a large dataset?
  2. Why does query 4 fail when it practically generates the same string format as query 1 ('YYYY-MM-ddThh:mm:ss.024Z')?

The following is the result of the explain result to the first query

EXPLAIN SELECT * FROM t_table 
WHERE d_datetime 
BETWEEN '2020-08-28T00:00:00.024Z' AND '2020-09-02T00:00:00.024Z';
                                                                                                                                              QUERY PLAN                                                                                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=31346.37..31788.13 rows=35341 width=22) (actual time=388.622..388.845 rows=6 loops=1)
   Output: count(_hyper_12_67688_chunk.octets), _hyper_12_67688_chunk.application, (date_trunc('day'::text, _hyper_12_67688_chunk.entry_time))
   Group Key: (date_trunc('day'::text, _hyper_12_67688_chunk.entry_time)), _hyper_12_67688_chunk.application
   Buffers: shared hit=17193
   ->  Gather  (cost=27105.45..31081.31 rows=35341 width=22) (actual time=377.109..398.285 rows=11 loops=1)
         Output: _hyper_12_67688_chunk.application, (date_trunc('day'::text, _hyper_12_67688_chunk.entry_time)), (PARTIAL count(_hyper_12_67688_chunk.octets))
         Workers Planned: 1
         Workers Launched: 1
         Buffers: shared hit=17193
         ->  Partial HashAggregate  (cost=26105.45..26547.21 rows=35341 width=22) (actual time=174.272..174.535 rows=6 loops=2)
               Output: _hyper_12_67688_chunk.application, (date_trunc('day'::text, _hyper_12_67688_chunk.entry_time)), PARTIAL count(_hyper_12_67688_chunk.octets)
               Group Key: date_trunc('day'::text, _hyper_12_67688_chunk.entry_time), _hyper_12_67688_chunk.application
               Buffers: shared hit=17193
               Worker 0: actual time=27.942..28.206 rows=5 loops=1
                 Buffers: shared hit=579
               ->  Result  (cost=1.73..25272.75 rows=111027 width=18) (actual time=0.805..141.094 rows=94662 loops=2)
                     Output: _hyper_12_67688_chunk.application, date_trunc('day'::text, _hyper_12_67688_chunk.entry_time), _hyper_12_67688_chunk.octets
                     Buffers: shared hit=17193
                     Worker 0: actual time=1.576..23.928 rows=6667 loops=1
                       Buffers: shared hit=579
                     ->  Parallel Append  (cost=1.73..23884.91 rows=111027 width=18) (actual time=0.800..114.488 rows=94662 loops=2)
                           Buffers: shared hit=17193
                           Worker 0: actual time=1.572..20.204 rows=6667 loops=1
                             Buffers: shared hit=579
                           ->  Parallel Bitmap Heap Scan on _timescaledb_internal._hyper_12_67688_chunk  (cost=1.73..11.23 rows=8 width=17) (actual time=1.570..1.618 rows=16 loops=1)
                                 Output: _hyper_12_67688_chunk.octets, _hyper_12_67688_chunk.application, _hyper_12_67688_chunk.entry_time
                                 Recheck Cond: ((_hyper_12_67688_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_67688_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone))
                                 Filter: ((_hyper_12_67688_chunk.application)::text = 'dns'::text)
                                 Rows Removed by Filter: 32
                                 Buffers: shared hit=11
                                 Worker 0: actual time=1.570..1.618 rows=16 loops=1
                                   Buffers: shared hit=11
                                 ->  Bitmap Index Scan on _hyper_12_67688_chunk_dpi_applications_entry_time_idx  (cost=0.00..1.73 rows=48 width=0) (actual time=1.538..1.538 rows=48 loops=1)
                                       Index Cond: ((_hyper_12_67688_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_67688_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone))
                                       Buffers: shared hit=2
                                       Worker 0: actual time=1.538..1.538 rows=48 loops=1
                                         Buffers: shared hit=2
                           ->  Parallel Index Scan Backward using _hyper_12_64752_chunk_dpi_applications_entry_time_idx on _timescaledb_internal._hyper_12_64752_chunk  (cost=0.14..2.36 rows=1 width=44) (actual time=0.040..0.076 rows=52 loops=1)
                                 Output: _hyper_12_64752_chunk.octets, _hyper_12_64752_chunk.application, _hyper_12_64752_chunk.entry_time
                                 Index Cond: ((_hyper_12_64752_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_64752_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone))
                                 Filter: ((_hyper_12_64752_chunk.application)::text = 'dns'::text)
                                 Rows Removed by Filter: 52
                                 Buffers: shared hit=

-- cut logs
                           ->  Parallel Seq Scan on _timescaledb_internal._hyper_12_64814_chunk  (cost=0.00..2.56 rows=14 width=17) (actual time=0.017..0.038 rows=32 loops=1)
                                 Output: _hyper_12_64814_chunk.octets, _hyper_12_64814_chunk.application, _hyper_12_64814_chunk.entry_time
                                 Filter: ((_hyper_12_64814_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_64814_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone) AND ((_hyper_12_64814_chunk.application)::text = 'dns'::text))
                                 Rows Removed by Filter: 40
                                 Buffers: shared hit=2
                           ->  Parallel Seq Scan on _timescaledb_internal._hyper_12_62262_chunk  (cost=0.00..2.54 rows=9 width=19) (actual time=0.027..0.039 rows=15 loops=1)
                                 Output: _hyper_12_62262_chunk.octets, _hyper_12_62262_chunk.application, _hyper_12_62262_chunk.entry_time
                                 Filter: ((_hyper_12_62262_chunk.entry_time >= '2020-08-28 05:45:03.024'::timestamp without time zone) AND (_hyper_12_62262_chunk.entry_time <= '2020-09-02 11:45:03.024'::timestamp without time zone) AND ((_hyper_12_62262_chunk.application)::text = 'dns'::text))
                                 Rows Removed by Filter: 37
                                 Buffers: shared hit=2
 Planning Time: 3367.445 ms
 Execution Time: 417.245 ms
(7059 rows)

The Parallel Index Scan Backward using... log continues for all hypertable chunks in the table.

For the other three queries that have previously been mentioned to be unsuccessful, they are still not completing when queried and just end up eventually filling the memory. Thus I cannot post the EXPLAIN results of these queries, sorry.

Please let me know if my question has not been properly structured. Thanks.


Solution

  • You are using a partitioned table that likely has a lot of partitions, because the planning time for the query takes 3 seconds.

    You are probably using PostgreSQL v11 or earlier. v12 introduced partition pruning at execution time, while v11 can only exclude partitions at query planning time.

    In your first query the WHERE condition contains constants, so that works. In the other queries, the function now() is used, whose result value is only known at query execution time (it is STABLE, not IMMUTABLE), so partition pruning cannot take place at query planning time. Query planning and execution need not happen at the same time – think of prepared statements.