python rocksdb

Python leaks memory if I read a tuple instead of a single value


I'm querying whether a key exists in a RocksDB database using Python. The API (see bottom) implies that the call returns a two-element tuple, so I unpack both elements:

def contains_key(database: rocksdb, key: int) -> bool:
    found, data = database.key_may_exist(key)
    return found

and I am using it like this:

if not contains_key(database, key):

However, this causes a memory leak.

To prove this, if I change the code to:

def contains_key(database: rocksdb, key: int) -> bool:
    return database.key_may_exist(key)

there is no leak (but it's obviously not correct).

How do I get the first version to work without a memory leak?

[Image: screenshot of the API documentation for key_may_exist]


Solution

  • Your code

    def contains_key(database: rocksdb, key: int) -> bool:
        found, data = database.key_may_exist(key)
        return found
    

    does not contain a memory leak.

    found and data are two names in the local scope, each referring to an object in memory. When the function returns, both names go out of scope. The object found referred to is passed back to the caller (x = contains_key(db, 3)), but the object data referred to is not. Since no other reference to that object exists, Python (as a garbage-collected language) automatically reclaims whatever resources were devoted to it.

    For that matter, the key_may_exist method returns a tuple that you never save a reference to. It is immediately unpacked, with found and data referring to whatever the tuple's elements referred to. The tuple itself is garbage-collected, but the objects it contained stay alive for as long as found and data refer to them.
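
One way to check the reasoning above is to watch the discarded object through a weak reference. The sketch below does not use the real rocksdb bindings: FakeDB and Payload are hypothetical stand-ins, and FakeDB.key_may_exist only imitates the shape of the return value (a flag plus a data object). It assumes the real method behaves like an ordinary Python function that returns a fresh tuple.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for the second tuple element (the data)."""


class FakeDB:
    """Hypothetical stub; it only mimics the (found, data) return shape."""

    def __init__(self):
        self.last_payload_ref = None

    def key_may_exist(self, key):
        payload = Payload()
        # A weak reference lets us observe the object without keeping it alive.
        self.last_payload_ref = weakref.ref(payload)
        return True, payload


def contains_key(database, key):
    # Same shape as the function in the question: unpack, return one element.
    found, data = database.key_may_exist(key)
    return found


db = FakeDB()
result = contains_key(db, 3)

# CPython frees the payload as soon as its last reference disappears;
# the explicit collection only matters on other interpreters.
gc.collect()

payload_is_gone = db.last_payload_ref() is None
print(result, payload_is_gone)
```

If payload_is_gone is true, the unpacked but unused data object was reclaimed once contains_key returned, so the unpacking itself does not hold on to memory. Any growth observed with the real library would have to come from somewhere else, for example from the native code behind the binding.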