I have two rows in a Cassandra column family and want to compare the values of columns with the same column name, e.g.:
CF: User
Key: Columns:
......................................................
Now I want to compute the difference between the K2 columns and the K1 columns to get this result in Cassandra:
Key: Columns:
.........................................................................
At first I wanted to code this with Hadoop, but I see a problem: I can't define two keys for a map process.
Hadoop was the choice because it must be a scalable solution.
I hope someone has a tip for this.
BG, Danny
I don't understand which row will be the base of the subtraction: K1[V1]-K2[V1] or vice versa?
OK, let's say the row with the more recent timestamp will be the base.
Your map step should emit the following (K => V):
// each value is a WritableComparable object to allow sorting by timestamp
"Andy" => {"key":K1, "value":100, timestamp1}
"Tom" => {"key":K1, "value":100, timestamp2}
"Andy" => {"key":K2, "value":120, timestamp3}
"Tom" => {"key":K2, "value":90, timestamp4}
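The reduce step will receive, for each column name, the array of values sorted by timestamp (Hadoop does not order the values of a key by default, so this needs a secondary sort or a sort inside the reducer):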
"Andy" => [ {"key":K1, "value":100, timestamp1},
{"key":K2, "value":120, timestamp3} ]
"Tom" => [ {"key":K1, "value":100, timestamp2},
{"key":K2, "value":90, timestamp4} ]
Now, in the reduce step, you can easily perform the subtraction and write the necessary columns, such as "diff", to the database.
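
To make the data flow concrete, here is a minimal local sketch of the map, group, and reduce steps in plain Python. It is not Hadoop or Cassandra code: the `rows` dictionary, the integer timestamps 1 to 4 (standing in for timestamp1 to timestamp4), and the function names `map_row`, `shuffle`, and `reduce_column` are all assumptions made for illustration. It also assumes the newer row is the base, so the result is newer minus older.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the two Cassandra rows:
# row key -> {column name: (value, timestamp)}
rows = {
    "K1": {"Andy": (100, 1), "Tom": (100, 2)},
    "K2": {"Andy": (120, 3), "Tom": (90, 4)},
}


def map_row(row_key, columns):
    """Map step: emit one (column name, record) pair per column."""
    for name, (value, timestamp) in columns.items():
        yield name, {"key": row_key, "value": value, "timestamp": timestamp}


def shuffle(pairs):
    """Group emitted records by column name, as the framework would."""
    grouped = defaultdict(list)
    for name, record in pairs:
        grouped[name].append(record)
    return grouped


def reduce_column(name, records):
    """Reduce step: sort by timestamp, then subtract older from newer."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    older, newer = ordered[0], ordered[-1]
    return name, newer["value"] - older["value"]


emitted = [
    pair
    for row_key, columns in rows.items()
    for pair in map_row(row_key, columns)
]
diffs = dict(
    reduce_column(name, records)
    for name, records in shuffle(emitted).items()
)
print(diffs)
```

Because the map output is keyed by column name rather than by row key, each mapper call only ever needs one row at a time, which is what gets around the "two keys per map" problem. In a real job the reducer would write each difference back to Cassandra instead of collecting it in a dictionary.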