I have a Cassandra model defined as:
import uuid

from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model


class MyModel(Model):
    ...
    ...
    created_at = columns.TimeUUID(primary_key=True,
                                  clustering_order='DESC',
                                  default=uuid.uuid1)
    ...
    ...
Recently the app ran into the "uuid1 creation doesn't close files" issue and hit the file descriptor limit. I tried to find a solution, but none of the options I can think of seems workable:
- Replace uuid1 in default with uuid4, but TimeUUID needs a time component, and only uuid1 provides that.
- Replace uuid1 with cassandra.util.uuid_from_time(time.time()). When I checked the code for uuid1 and uuid_from_time, both look the same, so that would not solve the problem either.
- The last option is to replace TimeUUID with the Timestamp type, but the created_at column is a primary_key with a clustering_order, so I don't know whether I can do that.
My column family already holds 1,000,000+ rows, so I can't just drop it.
I also want to know: what is the advantage of using TimeUUID instead of timestamp?
Are you certain you're hitting the libuuid issue you linked? Your code snippet shows the standard library uuid, which probably doesn't have that issue. Is it possible there's a different file descriptor leak in your program?
If it is libuuid, the easiest course would be to use the standard library implementation. If speed is a major concern for you, you might look into building a different version of libuuid to use with python-libuuid. I tried this one quickly and didn't notice any file descriptors leaking: http://www.ossp.org/pkg/lib/uuid/
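One rough way to check whether UUID generation is really the source of the leak is to count the process's open file descriptors before and after generating many UUIDs. The sketch below is only an illustration: `open_fd_count` and `fd_growth` are hypothetical helper names, and it assumes a Linux-style `/proc/self/fd` directory (with `/dev/fd` as a fallback).

```python
import os
import uuid


def open_fd_count():
    """Return the number of file descriptors currently open in this process."""
    # Assumption: a Linux-style /proc filesystem; /dev/fd is tried as a fallback.
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(fd_dir):
            return len(os.listdir(fd_dir))
    raise RuntimeError("no per-process fd directory available on this platform")


def fd_growth(make_uuid, iterations=10000):
    """Call make_uuid repeatedly and return the change in open descriptors."""
    make_uuid()  # warm up so one-time setup is not counted as a leak
    before = open_fd_count()
    for _ in range(iterations):
        make_uuid()
    return open_fd_count() - before


if __name__ == "__main__":
    print("fd growth for uuid.uuid1:", fd_growth(uuid.uuid1))
```

If the number keeps growing as `iterations` increases, UUID generation is a likely culprit; if it stays near zero, the leak is probably elsewhere in the program. Passing whichever generator the app actually uses in place of `uuid.uuid1` lets you compare implementations the same way.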
I also want to know, what is the advantage of using TimeUUID instead of timestamp?
You won't be able to change the type of the column on your existing table, but to answer your question: TimeUUID is usually used to avoid collisions when multiple events could be written with the same timestamp value.
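To make the collision point concrete, here is a small sketch using only the standard library. It creates events back to back and keys each one both by a millisecond timestamp (the resolution of Cassandra's timestamp type) and by a version 1 UUID. The helper names `generate_events` and `unix_millis_from_uuid1` are made up for this illustration.

```python
import time
import uuid

# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def unix_millis_from_uuid1(u):
    """Recover the Unix time in milliseconds embedded in a version 1 UUID."""
    return (u.time - UUID_EPOCH_OFFSET) // 10000


def generate_events(count=1000):
    """Create `count` events back to back, keyed two different ways."""
    millis = []
    time_uuids = []
    for _ in range(count):
        millis.append(int(time.time() * 1000))
        time_uuids.append(uuid.uuid1())
    return millis, time_uuids


if __name__ == "__main__":
    millis, time_uuids = generate_events()
    print("distinct millisecond timestamps:", len(set(millis)))
    print("distinct time UUIDs:", len(set(time_uuids)))
    print("time recovered from last UUID (ms):", unix_millis_from_uuid1(time_uuids[-1]))
```

In a tight loop many events typically share a millisecond, so rows keyed only by timestamp would overwrite one another within a partition, while the UUIDs stay distinct and still carry a time that can be recovered for ordering.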