Folks, I would like to solve the following with one table in Cassandra. The service tracks when users open an asset. On subsequent events for the same asset, we simply overwrite the accessDate.
example record:
{ userId: "string", assetId: "string", accessDate: unixTimestamp }
With this said, we need to fulfill the following access requirements (each requirement has its own bullet point for readability):
This is easy to achieve; the table could look like:
CREATE TABLE user_assets_tracker (
userId uuid,
accessDate timestamp,
assetId uuid,
PRIMARY KEY (userid, accessDate, assetId)
);
This allows us to query for all assets, and when each was last accessed.
SELECT *
FROM user_assets_tracker
WHERE userId = 522b1fe2-2e36-4cef-a667-cd4237d08b89
ORDER BY accessDate DESC;
Dandy. Now for the harder bits, which I am unsure about; I was hoping you folks could chime in:
Naturally the LIMIT here is not what we need. Also, we may need to have 2 tables to achieve this.
SELECT *
FROM user_assets_tracker
WHERE userid = 522b1fe2-2e36-4cef-a667-cd4237d08b89
ORDER BY accessDate DESC
LIMIT 10; -- ?????
This is probably straightforward with this schema:
CREATE TABLE user_assets_tracker (
userId uuid,
accessDate timestamp,
assetId uuid,
PRIMARY KEY (userid, accessDate, assetId)
);
SELECT *
FROM user_assets_tracker
WHERE userid = 522b1fe2-2e36-4cef-a667-cd4237d08b89
ORDER BY accessDate DESC
LIMIT 1;
Since accessDate comes before assetId in our schema, I am not sure how to do this either. Another table?
Thanks!!
PS It seems that SASI Index could be the solution
Since you always select assets ordered by accessDate descending, define your schema with that clustering order:
CREATE TABLE user_assets_tracker (
userid uuid,
accessdate timestamp,
assetid uuid,
PRIMARY KEY (userid, accessdate, assetid)
) WITH CLUSTERING ORDER BY (accessdate DESC, assetid ASC);
Now you don't need to specify ORDER BY accessDate DESC every time; rows within a partition are returned in accessDate descending order by default.
First, get the timestamp of 30 days ago.
Let's say the timestamp of 30 days ago is: 2017-02-05 12:00:00+0000
Now you can query:
SELECT * FROM user_assets_tracker WHERE userid = 522b1fe2-2e36-4cef-a667-cd4237d08b89 AND accessdate >= '2017-02-05 12:00:00+0000';
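The "timestamp of 30 days ago" step has to happen on the client side. A minimal Python sketch of one way to compute it, assuming the application works in UTC; the helper name `cutoff_literal` is hypothetical and not part of any driver:

```python
from datetime import datetime, timedelta, timezone


def cutoff_literal(now: datetime, days: int = 30) -> str:
    """Return a CQL timestamp literal for `days` before `now`, in UTC."""
    cutoff = now.astimezone(timezone.utc) - timedelta(days=days)
    # Same literal format as the query above: 'YYYY-MM-DD HH:MM:SS+0000'
    return cutoff.strftime("%Y-%m-%d %H:%M:%S+0000")


# Example: a fixed "now" chosen so the result matches the value used above.
now = datetime(2017, 3, 7, 12, 0, 0, tzinfo=timezone.utc)
print(cutoff_literal(now))
```

In a real application you would probably bind the cutoff as a parameter of a prepared statement rather than pasting a literal into the query string.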
If you are using Cassandra 3.0 or above, you can use a materialized view.
Create the materialized view:
CREATE MATERIALIZED VIEW user_assets AS
SELECT *
FROM user_assets_tracker
WHERE userid IS NOT NULL AND assetid IS NOT NULL AND accessdate IS NOT NULL
PRIMARY KEY (userid, assetid, accessdate);
Now, if you want to get all data for a given userid and assetid, here is the query:
SELECT * FROM user_assets WHERE userid = 522b1fe2-2e36-4cef-a667-cd4237d08b89 AND assetid = 1d45e6c2-02a1-11e7-aac5-b9ab92bee74c;
One more thing: if a huge amount of data is inserted for a single user, you should add a time bucket alongside userid in the partition key. For more, see this answer: https://stackoverflow.com/a/41857183/2320144
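To illustrate the time-bucket idea, here is a rough sketch assuming a month-sized bucket stored as text. The table name `user_assets_tracker_by_month`, the `bucket` column, and both helper functions are hypothetical; pick a bucket size that fits your write volume. A range query such as "last 30 days" then has to visit every bucket the range touches:

```python
from datetime import datetime, timezone

# Hypothetical bucketed variant of the table (names and bucket size are assumptions).
BUCKETED_TABLE = """
CREATE TABLE user_assets_tracker_by_month (
    userid uuid,
    bucket text,            -- e.g. '2017-03'
    accessdate timestamp,
    assetid uuid,
    PRIMARY KEY ((userid, bucket), accessdate, assetid)
) WITH CLUSTERING ORDER BY (accessdate DESC, assetid ASC);
"""


def month_bucket(ts: datetime) -> str:
    """Bucket value to store with a row written at time `ts` (UTC month)."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m")


def buckets_between(start: datetime, end: datetime) -> list:
    """All month buckets touched by the range [start, end], newest first."""
    start = start.astimezone(timezone.utc)
    end = end.astimezone(timezone.utc)
    year, month = end.year, end.month
    buckets = []
    while (year, month) >= (start.year, start.month):
        buckets.append(f"{year:04d}-{month:02d}")
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
    return buckets


start = datetime(2017, 2, 5, 12, 0, 0, tzinfo=timezone.utc)
end = datetime(2017, 3, 7, 12, 0, 0, tzinfo=timezone.utc)
print(buckets_between(start, end))
```

The client would run one query per returned bucket (each with the same accessdate >= cutoff condition) and concatenate the results in order.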