I am working on a project in which I need to delete every column and its data, except for one column and its data, in Cassandra using the Astyanax client.
I have a dynamic column family like the one below, and it already holds a couple of million records.
create column family USER_TEST
with key_validation_class = 'UTF8Type'
and comparator = 'UTF8Type'
and default_validation_class = 'UTF8Type'
and gc_grace = 86400
and column_metadata = [ {column_name : 'lmd', validation_class : DateType}];
I have user_id as the rowKey, and the other columns I have look something like this -
Now I need to delete all the columns and their data except for the a15 column. That is, I want to keep the a15 column and its data for every user_id (rowKey) and delete the rest of the columns and their data.
I already know how to delete data from Cassandra for a particular rowKey using the Astyanax client:
public void deleteRecord(final String rowKey) {
    try {
        MutationBatch m = AstyanaxConnection.getInstance().getKeyspace().prepareMutationBatch();
        // Queue a delete of the entire row for this rowKey
        m.withRow(AstyanaxConnection.getInstance().getEmp_cf(), rowKey).delete();
        // Nothing is sent to Cassandra until the batch is executed
        m.execute();
    } catch (ConnectionException e) {
        // some code
    } catch (Exception e) {
        // some code
    }
}
Now, how do I delete all the columns and their data except for one column, for every user id (which is my rowKey)?
Any thoughts on how this can be done efficiently using the Astyanax client?
It appears that Astyanax does not currently support the slice delete functionality that is a fairly recent addition to both the storage engine and the Thrift API. If you look at the Thrift API reference (http://wiki.apache.org/cassandra/API10), you will see that the delete operation takes a SlicePredicate, which can hold either a list of columns or a SliceRange. A SliceRange could specify all columns greater than or less than the column you want to keep, so two slice delete operations would delete all but one of the columns in the row.
Unfortunately, Astyanax can only delete an entire row or a defined list of columns, and it doesn't wrap the full SlicePredicate functionality. So it looks like you have three options:
1) See about sending a raw Thrift slice delete, bypassing the Astyanax wrapper.
2) Do a column read, followed by a row delete, followed by a column write. This is not ideally efficient, but if it isn't done too frequently it shouldn't be prohibitive.
3) Read the entire row and explicitly delete all of the columns other than the one you want to preserve.
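To make the third option more concrete, here is a minimal sketch of just the column selection step in plain Java. ColumnPruner and columnsToDelete are hypothetical names made up for illustration and are not part of Astyanax. The sketch assumes you have already read the row and collected its column names as strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper for illustration only; not part of Astyanax.
final class ColumnPruner {
    private ColumnPruner() {}

    // Returns every column name from the row except the one to keep,
    // in the same order the names were supplied.
    static List<String> columnsToDelete(Collection<String> columnNames, String keep) {
        List<String> toDelete = new ArrayList<>();
        for (String name : columnNames) {
            if (!name.equals(keep)) {
                toDelete.add(name);
            }
        }
        return toDelete;
    }
}
```

For a row whose columns are a1, a15 and lmd (a1 is an invented name here), calling ColumnPruner.columnsToDelete(names, "a15") yields a1 and lmd. Each returned name could then, as far as I know, be passed to deleteColumn on the object returned by m.withRow(...) before calling m.execute() on the batch. Doing this for a couple of million rows still means reading every row once, so it is probably best run in batches.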
I should note that while the storage engine and the Thrift API both support slice deletes, CQL does not yet explicitly support them either.
I filed this ticket to address that last limitation: https://issues.apache.org/jira/browse/CASSANDRA-6292