Search code examples
postgresqlautovacuum

Postgres n_tup_ins vs row count mismatch for a table which was never analyzed


I'm trying to figure out why no analyze job was ever performed on a table with more than 1 million rows. If I run the following query:

select count(*) from variant

I get 1,668,422

However when I retrieve the statistics on that same table:

SELECT relname,
n_tup_ins, n_tup_upd, n_tup_del,
n_live_tup, n_dead_tup,
last_vacuum, last_autoanalyze
FROM pg_stat_user_tables WHERE relname = 'variant';

          relname           | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | last_vacuum | last_autoanalyze
----------------------------+-----------+-----------+-----------+------------+------------+-------------+--------------
 variant_analysis_dependent |    140514 |       530 |         0 |     140514 |        104 |             | 
(1 row)

We can see that no analysis was ever performed. Then my expectation would be for n_tup_ins to be equal or greater than the amount of rows from the first query (and n_live_tup to be equal). Also n_tup_ins - n_dead_tup should be equal to n_live_tup

I have the following values for the settings:

SELECT name, setting, short_desc from pg_settings where category like 'Autovacuum';
                name                 |  setting  |                                        short_desc                                         
-------------------------------------+-----------+-------------------------------------------------------------------------------------------
 autovacuum                          | on        | Starts the autovacuum subprocess.
 autovacuum_analyze_scale_factor     | 0.1       | Number of tuple inserts, updates, or deletes prior to analyze as a fraction of reltuples.
 autovacuum_analyze_threshold        | 50        | Minimum number of tuple inserts, updates, or deletes prior to analyze.

So even with the tuple counts from the statistics, if we apply the formula autovacuum_analyze_scale_factor * number of tuples + autovacuum_analyze_threshold, we should get 0.1 * 140514 + 50 = 14101.4, less than n_tup_ins = 140514, so the analyze job should be triggered, right?

What am I missing here?


Solution

  • The 'number of tuples' to be used in your formula comes from pg_class, not from pg_stat_user_tables. So this observation is easy to explain if your server has crashed, or had pg_stat_reset() or kin invoked on it at the "right/wrong" time.