Cassandra data model to store 1,000,000 photos


This is a question for experienced Cassandra users. I'd like to store photos in Cassandra. The data structure is very simple:

UUID: photo_id;
String: filename;
String: authorname;

How should I store this data in Cassandra? Should I use photo_id as the row key of a column family (CF), or store all photos as columns in a single row, with photo_id as the column name? I need fast iteration over the photos and do not need fast access to, for example, author names.

Regards

Tom


Solution

  • If you plan to always look up photos by photo_id, treat Cassandra as a key-value store, with photo_id as the row key and the image as a column value. The metadata (filename, authorname) can be stored in additional columns within the same row if you generally need it at the same time as the image.

    If your images are very large, consider chunking them into pieces of 1 MB to 10 MB, one column per piece, so that you don't have to fetch a whole image at once.

    If you also need to look up photos by authorname occasionally, use a second CF as an index, where the row key is the authorname and the columns are photo_ids. You can then fetch the actual images from the first CF by photo_id.

    It's not clear what you mean by "fast iterating", but if you plan to scan the entire data set of 1 million images, you can do that quite easily against the first CF described above by using get_range_slices.
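
To make the layout above more concrete, here is a minimal sketch in Python. It does not talk to Cassandra at all: plain dictionaries stand in for the two column families, and every name in it (PhotoStore, split_into_chunks, the chunk_NNNNNN column names, the 1 MB default) is an assumption made for illustration, not part of any Cassandra client API. It only shows how the rows and columns could be shaped.

```python
import uuid
from collections import defaultdict

# Assumed chunk size: 1 MB, the low end of the 1 MB to 10 MB range.
CHUNK_SIZE = 1024 * 1024


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split image bytes into consecutive pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class PhotoStore:
    """In-memory stand-in for the two column families (hypothetical names)."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        # First CF: row key = photo_id, columns = metadata + image chunks.
        self.photos = {}
        # Second CF (index): row key = authorname, columns = photo_ids.
        self.photos_by_author = defaultdict(list)

    def put(self, photo_id: uuid.UUID, filename, authorname, image):
        row = {"filename": filename, "authorname": authorname}
        # One column per chunk; zero-padded names keep the chunks in order.
        for n, chunk in enumerate(split_into_chunks(image, self.chunk_size)):
            row["chunk_%06d" % n] = chunk
        self.photos[photo_id] = row
        self.photos_by_author[authorname].append(photo_id)

    def get_image(self, photo_id):
        """Reassemble the image from its chunk columns."""
        row = self.photos[photo_id]
        names = sorted(name for name in row if name.startswith("chunk_"))
        return b"".join(row[name] for name in names)

    def ids_by_author(self, authorname):
        """Index lookup: authorname -> photo_ids, then fetch rows by id."""
        return list(self.photos_by_author.get(authorname, []))

    def scan(self):
        """Iterate over every row of the first CF (the role a range scan plays)."""
        yield from self.photos.items()
```

In a real cluster, put would become writes to both column families and scan would page through rows with get_range_slices. The sketch keeps only the key and column structure so the trade-off is visible: lookups by photo_id touch one row, while lookups by author go through the index first.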