I am trying to learn how to implement a feed in cassandra (think twitter). I want to use wide rows to store all the posts made by a user. I am thinking about adding user information or statistical information in the same row (num of posts, last post date, user name, etc.).
My question is: is name, age, etc. "field name" stored in column? Or those wide rows only store the column-name and values specified? Am I wasting disk space? Am I compromising performance somehow?
Thanks!
-- TABLE CREATION
CREATE TABLE user_feed (
owner_id int,
name text,
age int,
posted_at timestamp,
post_text text,
PRIMARY KEY (owner_id, posted_at)
);
-- INSERTING THE USER
insert into user_feed (owner_id, name, age, posted_at) values (1, 'marc', 36, 0);
-- INSERTING USER POSTS
insert into user_feed (owner_id, posted_at, post_text) values (1, dateof(now()), 'first post!');
insert into user_feed (owner_id, posted_at, post_text) values (1, dateof(now()), 'hello there');
insert into user_feed (owner_id, posted_at, post_text) values (1, dateof(now()), 'i am kind of happy');
-- GETTING THE FEED
select * from user_feed where owner_id=1 and posted_at>0;
-- RESULT
owner_id | posted_at | age | name | post_text
----------+--------------------------+------+------+--------------------
1 | 2014-07-04 12:01:23+0000 | null | null | first post!
1 | 2014-07-04 12:01:23+0000 | null | null | hello there
1 | 2014-07-04 12:01:23+0000 | null | null | i am kind of happy
-- GETTING USER INFO - ONLY USER INFO IS POSTED_AT=0
select * from user_feed where owner_id=1 and posted_at=0;
-- RESULT
owner_id | posted_at | age | name | post_text
----------+--------------------------+------+------+--------------------
1 | 1970-01-01 00:00:00+0000 | 36 | marc | null
What about making them static?
A static column is the same in all partition key and since your partition key is the id of the owner you could avoid wasting space and retrieve the user informations in any query.
CREATE TABLE user_feed (
owner_id int,
name text static,
age int static,
posted_at timestamp,
post_text text,
PRIMARY KEY (owner_id, posted_at)
);
Cheers, Carlo