hadoop cassandra cql3 cassandra-cli

Cassandra/Hadoop WITH COMPACT STORAGE option: why is it needed, and is it possible to add it to existing tables/CFs?


I'm working on a Hadoop/Cassandra integration, and I have a couple of questions I was hoping someone could help me with.

First, I seem to need the source table/CF to have been created with the WITH COMPACT STORAGE option; otherwise I get an error saying it can't read the keyspace in my map/reduce code.

Is this just how it has to be?

If so, my second question is: can I add the WITH COMPACT STORAGE option to a pre-existing table, and how? Or am I going to have to re-create the tables and move the data around?

I am using Cassandra 1.2.6.

Thanks in advance, Gerry


Solution

  • I'm assuming you are using job.setInputFormatClass(ColumnFamilyInputFormat.class);

    Instead, try using job.setInputFormatClass(CqlPagingInputFormat.class);

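    With this input format, the Mapper's input key and value types are both Map<String, ByteBuffer>: the key map holds the row's primary key columns and the value map holds the row's column values, each keyed by column name.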

    Similarly, if you need to write out to Cassandra, use CqlOutputFormat (the CQL3 output format class in Cassandra 1.2.6) and the appropriate output types.

    See http://www.datastax.com/dev/blog/cql3-table-support-in-hadoop-pig-and-hive for more info.
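
Because the mapper receives raw ByteBuffer values rather than typed columns, each value has to be decoded according to its CQL type. The following is only a minimal sketch of that decoding step using nothing but the JDK: the class CqlRowDecoder, its methods, and the column names "word" and "count" are hypothetical and are not part of Cassandra or Hadoop. It assumes a text column is stored as UTF-8 bytes and a bigint column as an 8-byte big-endian long.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Cassandra class): decodes the raw column values
// that a CQL3 input format passes to a mapper as Map<String, ByteBuffer>.
final class CqlRowDecoder {

    // Decode a text/varchar column as UTF-8. Works on a duplicate so the
    // original buffer's position is left untouched. Returns null if absent.
    static String text(Map<String, ByteBuffer> row, String column) {
        ByteBuffer value = row.get(column);
        if (value == null) {
            return null;
        }
        return StandardCharsets.UTF_8.decode(value.duplicate()).toString();
    }

    // Decode a bigint column (assumed to be 8 bytes, big-endian).
    // Uses an absolute read, so the buffer's position is not advanced.
    static long bigint(Map<String, ByteBuffer> row, String column) {
        ByteBuffer value = row.get(column);
        if (value == null) {
            throw new IllegalArgumentException("missing column: " + column);
        }
        return value.getLong(value.position());
    }

    public static void main(String[] args) {
        // Stand-in for the value map a mapper would receive for one row.
        Map<String, ByteBuffer> columns = new LinkedHashMap<>();
        columns.put("word", ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
        columns.put("count", ByteBuffer.allocate(8).putLong(0, 3L));

        System.out.println(text(columns, "word") + " " + bigint(columns, "count"));
    }
}
```

Inside a real job, the same calls would go in the map method of a Mapper whose input key and value types are both Map<String, ByteBuffer>. Reading through a duplicate or an absolute index matters if the same buffer may be read more than once.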