
How to store small files in Cassandra?


I have a few thousand individual HTML files stored locally in a directory; each is at most a few kilobytes.

I want to store them in a single Cassandra node. How would I go about doing that programmatically with Hector? What APIs do I use to set up the column family to handle static files, and how should I set up the schema? Thanks!


Solution

  • I want to store them in a single node of Cassandra, how would I go about doing that programmatically with Hector?

    You need to create a keyspace in Cassandra. You can create it either on your cluster node with cassandra-cli, using the create keyspace command, or from Hector with the addKeyspace() method.

    What APIs do I use to setup the columnFamily to handle static files and how should I setup the schema?

    You can use BasicColumnDefinition when defining the column family in Cassandra. Take a look at the sample code here on how to add a column family to a keyspace. You will probably have a column family called html_doc, with the HTML filename as the column name and a value of type AsciiType or UTF8Type (or the default, BytesType). The HTML document needs to be read in the standard Java way; you can see how to insert a value for a column here.
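
    The "read in the standard Java way" step could look roughly like the sketch below. It uses only the JDK and deliberately leaves out the Cassandra write. The class name HtmlFileLoader and the method loadHtmlFiles are made up for this illustration, and the sketch assumes the files end in .html and are UTF-8 encoded. Each map entry (filename and content) is what you would then insert as a column name and value in html_doc with Hector.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Hector or Cassandra.
final class HtmlFileLoader {

    /**
     * Reads every *.html file directly inside dir (no recursion) and
     * returns a map of file name -> file content.
     * Assumption: the files are UTF-8 encoded.
     */
    static Map<String, String> loadHtmlFiles(Path dir) throws IOException {
        Map<String, String> docs = new TreeMap<>();
        // The glob limits the listing to names ending in .html.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.html")) {
            for (Path file : stream) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip directories that happen to match the glob
                }
                // Files are only a few kilobytes, so reading each one whole is fine.
                byte[] bytes = Files.readAllBytes(file);
                docs.put(file.getFileName().toString(),
                         new String(bytes, StandardCharsets.UTF_8));
            }
        }
        return docs;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, String> e : loadHtmlFiles(dir).entrySet()) {
            // This is the point where each entry would be written to Cassandra.
            System.out.println(e.getKey() + ": " + e.getValue().length() + " chars");
        }
    }
}
```

    If you keep the default BytesType for the column value, you could store the raw byte[] from Files.readAllBytes instead of decoding it to a String.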