Search code examples
nosqlcassandrarediskey-valueriak

Suggest a database for key with multiple values , highly scalable


We have data with key-multipleValues. Each key can have around 500 values (each value will be around 200-300 chars) and the number of such keys will be around 10 million. Major operation is to check for a value given a key.

I've been using mysql for long time where i've got 2 options: one row for each keyvalue, one row for each key with all values in a text field.But these does not seem efficient to me as the first model has lot of rows,redundancies and second model text field will become very large .

I am considering using nosql database for this purpose, i've used mongodb before and i dont think it is suitable for my current case. keyvalue based or column family based nosql db would be better.It need not be distributed.Someone who used riak,redis,cassandra etc pls share your thoughts.

Thanks


Solution

  • From your description, it seems some sort of Key-value store will be better for you comparing relational DB.

    The data itself seem to be a non-relational, why store in a relational storage? It seems valid to use something like Cassandra.

    I think a typical data-structure for this data to store will be a column family, with Key as Row-key and Columns as value.

    MyDATA: (ColumnFamily)
      RowKey=>Key
        Column1=>val1
        Column2=>val2
        ...
        ...
        ColumnN=valN
    

    The data would look like (JSON notation):

    MyDATA (CF){
        [
          {key1:[{val1-1:'', timestamp}, {val1-2:'', timestamp}, .., {val1-500:'', timestamp}]},
          {key2:[{val2-1:'', timestamp}, {val2-2:'', timestamp}, .., {val2-500:'', timestamp}]},
          ...
          ...
        ]
    }
    

    Hopefully this helps.