
Cloud Bigtable doesn't appear to be removing data that should be garbage collected


I am using a Cloud Bigtable development cluster. I set the maximum number of versions to 1 for a specific column family, but the change doesn't seem to have affected my existing data: when I perform a lookup, old versions are still returned. What am I missing?

I ran:

cbt setgcpolicy table column_family maxversions=1

cbt ls table
Family Name     GC Policy
-----------     ---------
p               versions() > 1
z               age() > 3d

When I run lookup, I still see the old versions.

cbt lookup 'table' key columns=p:field

Solution

  • Based on what you've shown, your garbage collection policy is set up correctly.

    Cloud Bigtable's garbage collection is a continuous background process, so changing a policy does not remove data immediately. It can take up to a week from the time data matches your rule until that data is actually deleted. In the meantime, filter your read requests so they return only the latest version (or apply whatever criteria your rule specifies), so that you never fetch data that is waiting to be garbage collected.
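
As a minimal sketch of that read-side filtering with the same `cbt` tool: `cbt lookup` accepts a `cells-per-column` option that limits how many cells come back per column, which mirrors the `maxversions=1` policy on family `p`. The wrapper name `lookup_latest` is hypothetical and only here for illustration, and this assumes your installed `cbt` version supports the option.

```shell
# lookup_latest TABLE ROW_KEY FAMILY:QUALIFIER
# Hypothetical helper: reads one row, returning at most the newest cell
# per column, so versions that garbage collection has not yet removed
# are not returned to the caller.
lookup_latest() {
  cbt lookup "$1" "$2" "columns=$3" cells-per-column=1
}
```

With the names from the question, the call would be `lookup_latest table key p:field`, which is equivalent to running `cbt lookup table key columns=p:field cells-per-column=1` directly. This only covers the version-count rule; an age-based rule such as the one on family `z` would need a timestamp-based filter instead, which is not shown here.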