lucene cratedb

Index corruption on large table


I have a large table with around 123 million records in CrateDB. During a snapshot to S3 (or to the file system), an index corruption occurs on each shard, which results in a partial snapshot. After CrateDB is restarted, the table fails to load because of the corrupted index. I have to remove the corrupted file and a lock file from the index folder before the table heals. I have tried recreating the table by moving everything to another table and swapping the two (using the ALTER CLUSTER command), but the corruption occurs on the new table as well.

Is there anything else I can try to fully snapshot the cluster and avoid corruption?


Solution

  • The Crate team found a bug: https://github.com/crate/crate/pull/9318. It is resolved in 4.0.8.
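
To check whether a cluster already runs a release with the fix, one option is to compare each node's version against 4.0.8. The sketch below is only an illustration: the helper names (`parse_version`, `has_snapshot_fix`, `nodes_needing_upgrade`) are hypothetical, and it assumes that every release numbered 4.0.8 or higher carries the fix, which the answer implies but does not state. The version strings themselves can be read from CrateDB with `SELECT name, version['number'] FROM sys.nodes`.

```python
import re

# Release named in the answer as containing the fix.
FIXED_IN = (4, 0, 8)


def parse_version(version: str) -> tuple:
    """Turn a version string such as '4.0.7' into a comparable tuple (4, 0, 7)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_snapshot_fix(version: str) -> bool:
    """Assumption: any release >= 4.0.8 includes the fix."""
    return parse_version(version) >= FIXED_IN


def nodes_needing_upgrade(node_versions: dict) -> list:
    """Return the sorted names of nodes still running a release before 4.0.8.

    node_versions maps node name -> version string, for example the rows
    returned by: SELECT name, version['number'] FROM sys.nodes
    """
    return sorted(
        name for name, version in node_versions.items()
        if not has_snapshot_fix(version)
    )
```

Calling `nodes_needing_upgrade` with the rows from `sys.nodes` lists the nodes to upgrade before retrying the snapshot; an empty list means all nodes are at 4.0.8 or later.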