
RocksDB iterator still returns old data after a key is deleted


I ran an experiment to track down a RocksDB problem I hit while developing a system with pyrocksdb. I tried the code below:

def func(iterator):
    for k, v in iterator:
        print("k:{}, v:{}".format(k, v))

import rocksdb
db = rocksdb.DB("test.db", rocksdb.Options(create_if_missing=True))
batch = rocksdb.WriteBatch()
batch.put(b'a1', b'data1')
batch.put(b'a2', b'data2')
batch.put(b'a3', b'data3')
db.write(batch)
it = db.iteritems()
it.seek_to_first()
func(it)

#print info
k:b'a1', v:b'data1'
k:b'a2', v:b'data2'
k:b'a3', v:b'data3'

#delete a kv
db.delete(b'a1')
it.seek_to_first()
func(it)

#print info, k:b'a1', v:b'data1' is still available
k:b'a1', v:b'data1'
k:b'a2', v:b'data2'
k:b'a3', v:b'data3'

#but if I reassign it
it = db.iteritems()
it.seek_to_first()
func(it)

#print info, delete takes effect
k:b'a2', v:b'data2'
k:b'a3', v:b'data3'

Why do I need to reassign the iterator before the delete becomes visible?


Solution

  • Reason:

    • The first call to db.iteritems() creates an iterator, which is stored in it. A RocksDB iterator reads from a consistent view of the database as it was when the iterator was created.
    • db.delete(b'a1') changes the database, but not the view held by it. Calling it.seek_to_first() only moves the existing iterator back to the start of that same view, so b'a1' is still returned.
    • To see the delete, you need a new iterator, which is what it = db.iteritems() does in your last case.
    • The new iterator is created after the delete, so b'a1' no longer appears.
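
    The behaviour seen in the question can be reproduced without RocksDB using a small pure-Python model. This is only a sketch: ToyDB and ToyIterator are hypothetical stand-ins invented for this example, not part of pyrocksdb, and they imitate the point-in-time view by copying the data when the iterator is created (RocksDB itself does not copy the data).

```python
class ToyIterator:
    """Hypothetical iterator that walks a frozen list of (key, value) pairs."""

    def __init__(self, snapshot):
        self._snapshot = snapshot  # fixed at creation time
        self._pos = 0

    def seek_to_first(self):
        # Rewinds the cursor only; the snapshot itself is not refreshed.
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._snapshot):
            raise StopIteration
        item = self._snapshot[self._pos]
        self._pos += 1
        return item


class ToyDB:
    """Hypothetical in-memory key-value store (an assumption, not pyrocksdb)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def iteritems(self):
        # Each call freezes a sorted copy of the current contents.
        return ToyIterator(sorted(self._data.items()))


db = ToyDB()
for key, value in [(b"a1", b"data1"), (b"a2", b"data2"), (b"a3", b"data3")]:
    db.put(key, value)

it = db.iteritems()      # view taken before the delete
db.delete(b"a1")

it.seek_to_first()
stale = list(it)         # old iterator: still includes b'a1'

new_it = db.iteritems()  # view taken after the delete
new_it.seek_to_first()
fresh = list(new_it)     # new iterator: b'a1' is gone
```

    Here stale keeps all three pairs while fresh holds only a2 and a3, matching the output shown in the question.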

  • Solution:

    Wrap the repetitive steps in a function that creates a fresh iterator each time, and call it whenever you want to print the current data.

    def update_func():
        it = db.iteritems()
        it.seek_to_first()
        func(it)
    
    update_func()
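
    If you want the pairs back instead of printing them, a generator variant is one option. This is a sketch under assumptions: fresh_items is a hypothetical helper name, and it relies only on the db.iteritems() and seek_to_first() calls already used in the question. The _StubDB and _StubIterator classes exist purely so the snippet runs without RocksDB; with pyrocksdb you would pass your real db object instead.

```python
def fresh_items(db):
    """Yield (key, value) pairs from an iterator created at call time."""
    it = db.iteritems()   # new iterator, so it reflects writes made so far
    it.seek_to_first()
    yield from it


class _StubIterator(list):
    """Stand-in iterator for demonstration only (not pyrocksdb)."""

    def seek_to_first(self):
        pass  # a list always iterates from its first element


class _StubDB:
    """Stand-in database for demonstration only (not pyrocksdb)."""

    def __init__(self):
        self.data = {}

    def iteritems(self):
        return _StubIterator(sorted(self.data.items()))


stub = _StubDB()
stub.data = {b"a1": b"data1", b"a2": b"data2"}
before = list(fresh_items(stub))

del stub.data[b"a1"]
after = list(fresh_items(stub))   # a new iterator is built on every call
```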