Background
Astyanax's Entity Persister saves a Map field of an Entity as multiple columns, one per entry. Each column is named mapVariable.key.
The Problem
Astyanax's Entity Persister doesn't remove deleted key/value pairs from Cassandra when a Map in an Entity has been updated.
The Solution I'm Using Now (bad approach)
I'm deleting the whole row and then reinserting it.
Some More Info
I persist my Java objects in Cassandra using Astyanax's Entity Persister (com.netflix.astyanax.entitystore).
What I've noticed is this: an Entity's Map is first persisted with, say, two entries, testkey:testvalue and testkey2:testvalue2. The next time the same Entity is persisted, its Map has only one entry, testkey:testvalue, because the other pair was removed. After that second write, testkey2:testvalue2 is still present in the column family.
So, as a workaround, I need to delete the whole row and then reinsert it.
My insertion code:
final EntityManager<T, String> entityManager = new DefaultEntityManager.Builder<T, String>()
        .withEntityType(clazz)
        .withKeyspace(getKeyspace())
        .withColumnFamily(columnFamily)
        .build();
entityManager.put(entity);
What am I missing? This is really inefficient, and I think Astyanax's Entity Persister is supposed to take care of this on its own.
Any thoughts?
You are not missing anything.
What happens is the following:
1. Astyanax creates a list of ColumnMappers, one for each field of the entity being serialized.
2. The ColumnMappers then take turns populating the mutation batch.
3. For maps, MapColumnMapper is used. If you take a look at its code, you will see that it just adds key:value pairs to the mutation batch.
4. When the batch is applied to a row in Cassandra, new columns are added and existing ones are overwritten, but columns that are absent from the batch (such as those for removed map keys) are unfortunately left in place.
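To make the write semantics described above concrete, here is a small stand-alone simulation. It does not use Astyanax or Cassandra at all: the row is modeled as a plain sorted map, and the class name StaleColumnDemo, the put helper, and the field name myMap are made up for illustration. It only mimics the "add or overwrite, never delete" behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a single row: column name -> value. Not Astyanax code.
final class StaleColumnDemo {

    // Flattens a map field into "field.key" columns and writes them as an
    // upsert: columns in the batch are added or overwritten, columns that
    // are missing from the batch are left untouched.
    static void put(Map<String, String> row, String field, Map<String, String> map) {
        for (Map.Entry<String, String> e : map.entrySet()) {
            row.put(field + "." + e.getKey(), e.getValue());
        }
    }

    // Persists the map twice: first with two entries, then with one.
    static Map<String, String> demo() {
        Map<String, String> row = new TreeMap<>();

        Map<String, String> first = new LinkedHashMap<>();
        first.put("testkey", "testvalue");
        first.put("testkey2", "testvalue2");
        put(row, "myMap", first);

        Map<String, String> second = new LinkedHashMap<>();
        second.put("testkey", "testvalue");
        put(row, "myMap", second);

        return row;
    }

    public static void main(String[] args) {
        // Prints {myMap.testkey=testvalue, myMap.testkey2=testvalue2}
        System.out.println(demo());
    }
}
```

The second write never mentions myMap.testkey2, so nothing tells the row to drop it. That is the stale column from the question.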
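One solution here would be to write a custom serializer for your map and save it in one field. The whole map then lives in a single column, so every put overwrites it completely and removed entries cannot linger.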
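As a rough sketch of the single-field idea, the helper below packs a Map<String, String> into one String and back, using only the JDK. The class name MapFieldCodec and the encoding (Base64 for each key and value, ':' between key and value, ',' between entries) are my own choices for illustration, not anything Astyanax prescribes. How you hook it into the entity, for example through a String field or a custom serializer class, is left out here, and null keys or values are not handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: stores a whole map in one string value.
final class MapFieldCodec {

    // Encodes each entry as base64(key):base64(value), joined by commas.
    // URL-safe Base64 without padding never emits ':' or ',', so the
    // delimiters cannot collide with the data.
    static String encode(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(b64(e.getKey())).append(':').append(b64(e.getValue()));
        }
        return sb.toString();
    }

    // Reverses encode(). An empty string decodes to an empty map.
    static Map<String, String> decode(String encoded) {
        Map<String, String> map = new LinkedHashMap<>();
        if (encoded.isEmpty()) {
            return map;
        }
        for (String pair : encoded.split(",")) {
            String[] kv = pair.split(":", -1); // -1 keeps an empty value
            map.put(unb64(kv[0]), unb64(kv[1]));
        }
        return map;
    }

    private static String b64(String s) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    private static String unb64(String s) {
        return new String(Base64.getUrlDecoder().decode(s), StandardCharsets.UTF_8);
    }
}
```

The trade-off is that individual map entries are no longer separate columns, so you cannot read or update a single key without rewriting the whole value.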