indexing, time-series, influxdb

DELETE - what does it practically mean that series are not dropped from the index?


We have a use case in which we are considering deleting single points from series with DELETE FROM measurement WHERE Tag1='val1' .... and time='...'.

We can see that this works fine and the points are no longer retrievable after the delete. However, one thing bothers us, namely the note in the documentation: "it does not drop the series from the index".

I understand that if we drop a whole series with DELETE, the index won't be updated, but in the end this will just take up some space and not slow down the engine in any way. Is that correct?

However, I'm not sure how to understand this in our particular case, where we delete single points from different series. Are there any hidden pitfalls in this approach? What will eventually happen to the indexes? Is the engine going to be slowed down at some point in the future?

Any comments, suggestions, and advice are highly appreciated!


Solution

  • I believe DELETE marks the points as deleted without physically removing them right away. They still consume disk space but no longer show up in query results. InfluxDB calls such a marker a "tombstone". This is different from dropping an entire database, retention policy, or measurement, which actually removes the data.

    After a given time, when InfluxDB runs its background compaction process, it cleans up the tombstones. The ideal way to remove data is to just let the retention policy take effect and remove it automatically.
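
    To make the tombstone idea concrete, here is a minimal Python sketch. It is a toy model only and not InfluxDB's actual storage engine: the class `ToyStore`, its methods, and the series key `cpu,host=a` are all hypothetical names invented for illustration. It assumes that a delete only records a marker, that queries filter marked points out, and that a later compaction step physically drops them. The series index is deliberately left untouched throughout, mirroring the documentation note quoted in the question.

```python
class ToyStore:
    """Toy model of tombstone-style deletes; NOT InfluxDB's real engine."""

    def __init__(self):
        self.points = {}           # (series_key, timestamp) -> value, the stored data
        self.tombstones = set()    # keys marked as deleted but still stored
        self.series_index = set()  # series keys known to the index

    def write(self, series_key, timestamp, value):
        key = (series_key, timestamp)
        self.points[key] = value
        self.tombstones.discard(key)  # a rewrite makes the point visible again
        self.series_index.add(series_key)

    def delete(self, series_key, timestamp):
        # Only record a marker; the stored point is left in place for now.
        key = (series_key, timestamp)
        if key in self.points:
            self.tombstones.add(key)

    def query(self, series_key):
        # Tombstoned points are filtered out at read time.
        return sorted(
            (ts, value)
            for (sk, ts), value in self.points.items()
            if sk == series_key and (sk, ts) not in self.tombstones
        )

    def compact(self):
        # Physically drop tombstoned points and clear the markers.
        # The series index is intentionally not modified here (assumption).
        for key in self.tombstones:
            del self.points[key]
        self.tombstones.clear()


store = ToyStore()
store.write("cpu,host=a", 1, 0.5)
store.write("cpu,host=a", 2, 0.7)

store.delete("cpu,host=a", 1)
visible_after_delete = store.query("cpu,host=a")  # only the point at t=2 is visible
stored_before_compact = len(store.points)         # both points still take up space

store.compact()
stored_after_compact = len(store.points)          # the deleted point is gone
```

    In this sketch the deleted point disappears from query results immediately, but the space is only reclaimed once `compact()` runs, and the series key stays in `series_index` even if every point of that series has been deleted.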