I am using Postgres 9.4.5 and I detected corruption in one of my tables. I noticed this when running queries on a specific table caused the entire database to go into recovery mode. The symptoms lined up with those found in this article:
https://www.endpoint.com/blog/2010/06/01/tracking-down-database-corruption-with
I tried following the steps to zero out the corrupt block but after following the steps, I got a ctid of 507578.
database=# \set FETCH_COUNT 1
database=# \pset pager off
Pager usage is off.
database=# SELECT ctid, left(coded_element_key::text, 20) FROM coded_element WHERE ctid >= '(507577,1)';
ctid | left
------------+----------
(507577,1) | 30010491
(507577,2) | 30010507
(507577,3) | 30010552
(507577,4) | 30010556
(507577,5) | 30010559
(507577,6) | 30010564
(507577,7) | 30010565
(507577,8) | 30010625
...
...
...
(507578,26) | 0A1717281.0002L270&.
(507578,27) | L270&.*)0000.0000000
(507578,28) | 30011452
(507578,29) | -L0092917\x10)*(0117001
(507578,30) | 0.00003840\x10)*)300114
ERROR: invalid memory alloc request size 1908473862
The problem is that when I went to my /data/base directory and found the corresponding file for my table, the file was only 1073741824 bytes. With a block size of 8192 bytes this only gives me a block count of 131072, way under the 507578 value where the supposed corruption is. Is this the correct way to determine the block offset or is there a different way?
PostgreSQL stores the table data in files of 1GB in size.
Assuming that the result of
SELECT relfilenode
FROM pg_class
WHERE relname = 'coded_element';
is 12345, those files would be called 12345
, 12345.1
, 12345.2
and so on.
Since each of these 1GB segments contains 131072 blocks, block 507578 is actually block 114362 in 12345.4
.
The data corruption is either in that block or the following one.
At this point, make sure you have a backup the complete data directory.
To zero out block 507578, you can use
dd if=/dev/zero of=12345.4 bs=8192 seek=114362 count=1 conv=notrunc,nocreat,fsync
If that doesn't do the trick, try the next block.
To salvage data from the block before zeroing it, you can use the pageinspect
extension.