I'm trying to learn how sharding is configured in Postgres.
My Postgres setup has a temperature
table which has 4 partitions each covering different range of "timestamp" value.
postgres=# \d+ temperature
Partitioned table "public.temperature"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-----------+-----------------------------+-----------+----------+-----------------------------------------+---------+--------------+-------------
id | bigint | | not null | nextval('temperature_id_seq'::regclass) | plain | |
city_id | integer | | not null | | plain | |
timestamp | timestamp without time zone | | not null | | plain | |
temp | numeric(5,2) | | not null | | main | |
Partition key: RANGE ("timestamp")
Partitions: temperature_201901 FOR VALUES FROM ('2019-01-01 00:00:00') TO ('2019-02-01 00:00:00'),
temperature_201902 FOR VALUES FROM ('2019-02-01 00:00:00') TO ('2019-03-01 00:00:00'),
temperature_201903 FOR VALUES FROM ('2019-03-01 00:00:00') TO ('2019-04-01 00:00:00'),
temperature_201904 FOR VALUES FROM ('2019-04-01 00:00:00') TO ('2019-05-01 00:00:00')
temperature_201904
table, in particular, is a foreign table
postgres=# \d+ temperature_201904
Foreign table "public.temperature_201904"
Column | Type | Collation | Nullable | Default | FDW options | Storage | Stats target | Description
-----------+-----------------------------+-----------+----------+-----------------------------------------+-------------+---------+--------------+-------------
id | bigint | | not null | nextval('temperature_id_seq'::regclass) | | plain | |
city_id | integer | | not null | | | plain | |
timestamp | timestamp without time zone | | not null | | | plain | |
temp | numeric(5,2) | | not null | | | main | |
Partition of: temperature FOR VALUES FROM ('2019-04-01 00:00:00') TO ('2019-05-01 00:00:00')
Partition constraint: (("timestamp" IS NOT NULL) AND ("timestamp" >= '2019-04-01 00:00:00'::timestamp without time zone) AND ("timestamp" < '2019-05-01 00:00:00'::timestamp without time zone))
Server: shard02
Insert works as expected. If I insert the following value and check from the remote host shard02
, then the value exists. Fantastic!
postgres=# select * from temperature_201904;
id | city_id | timestamp | temp
----+---------+---------------------+-------
1 | 1 | 2019-04-02 00:00:00 | 12.30
(1 row)
However, if I update the timestamp of this row such that it's no longer valid for the range defined for the partition, I'd expect it to get moved out and placed into the correct partition, temperature_201901
, but it's not.
postgres=# update temperature set timestamp = '2019-01-04' where id=1;
UPDATE 1
postgres=# select * from temperature_201904 ;
id | city_id | timestamp | temp
----+---------+---------------------+-------
1 | 1 | 2019-01-04 00:00:00 | 12.30
Again, just to reiterate, this table has a range temperature_201904 FOR VALUES FROM ('2019-04-01 00:00:00') TO ('2019-05-01 00:00:00')
and is a foreign table.
Feels like I'm missing something here.
Is this an expected behavior? If so, is there a way to configure such that data are automatically moved between nodes as their partition constraints are changed?
Thanks in advance!
postgres=# SELECT version();
version
------------------------------------------------------------------------------------------------------------------
PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
This seems to be expected. From the docs
While rows can be moved from local partitions to a foreign-table partition (provided the foreign data wrapper supports tuple routing), they cannot be moved from a foreign-table partition to another partition.
Now I would have expected an ERROR rather than silently violating the implied constraint, but I wouldn't expect this to have worked the way to you want it to.