Search code examples
pythoncassandratime-seriescql3cqlsh

Is it possible to grab all values of a Cassandra composite key?


Say I have:

cur.execute("CREATE TABLE data_by_year ( device_id int, \
        site_id text, year_id int, event_time timestamp, value float, \
        PRIMARY KEY ((device_id, site_id, year_id),event_time))")

And I want to query all devices for years 2014 and 2013.

result=cur.execute("select distinct device_id, site_id, year_id,\
    from data_by_year where device_id IN (324535, 32453l),\
    and site_id in  and year_id IN (2014)")

Obvously this statement has many issues but it's the best example I could come up with. My beef is with the "where device_id IN (324535, 32453l)". In reality I will not know all the various devices so I want to grab them "ALL". How do I do this?

I'm dealing with time series minute data so I felt that one year was a reasonable partition.


Solution

  • knifewine's answer is correct, but if you're going to be executing this query frequently (and want good performance), I suggest using a second table:

    CREATE TABLE all_device_data_by_year (
        site_id text,
        year_id int,
        device_id int,
        event_time timestamp,
        value float,
        PRIMARY KEY ((site_id, year_id), device_id, event_time)
    )
    

    You might want to partition by day/month instead of year, depending on the number of devices.

    Regarding automatic query paging support in the python driver, it's available right now in the 2.0 branch. I should have a 2.0-beta release ready soon.