
How to design a Bigtable row key


I would like to design an optimal row key in Bigtable. I know the key design is crucial for query speed and optimization. My case is related to time-series information from a network machine. It is a tall and narrow table with 3 columns: id, info and datetime.

My most frequent query is to get all info for each id for each day.

How should the key be designed to obtain the best performance? id#date?


Solution

  • Disclosure: I lead product management for Google Cloud Bigtable.

    "My case is related to time-series information from a network machine. It is a tall and narrow table with 3 columns: id, info and datetime."

    Given that the id is in the row key, I am not sure if you need a separate id column.

    Similarly, can you please clarify why you need to have datetime as a separate column? Note that each value in Cloud Bigtable has an associated timestamp, so you don't need to store a separate date/time in a separate column.

    "My most frequent query is to get all info for each id for each day."

    "How should the key be designed to obtain the best performance? id#date?"

    My recommendation would be to do as you suggested: id#date as the row key, and store all the data for that date within a single row, using the timestamp of each cell value to differentiate it, so that you can get the exact timestamp of each reading.

    As noted above, I think you can drop both the id and datetime columns, and accomplish this use case with just a single column for the table.

    Best of luck with your project; please let us know how it goes!
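
To illustrate the layout suggested above, here is a minimal Python sketch. It does not use the Bigtable client library: the table is a plain in-memory dictionary, and the names `make_row_key`, `write_reading`, `readings_for_day` and the sample id `machine-42` are hypothetical, chosen only for this example. It assumes that ids never contain `#` and that all timestamps are in UTC.

```python
from collections import defaultdict
from datetime import date, datetime, timezone
from typing import Dict, List, Tuple

DELIMITER = "#"


def make_row_key(machine_id: str, day: date) -> bytes:
    """Build an id#date row key such as b'machine-42#2024-03-05'."""
    if DELIMITER in machine_id:
        raise ValueError("machine_id must not contain the key delimiter")
    # ISO dates (YYYY-MM-DD) sort lexicographically in chronological order,
    # so the rows for one id end up next to each other, ordered by day.
    return f"{machine_id}{DELIMITER}{day.isoformat()}".encode("utf-8")


# In-memory stand-in for the table (an assumption for illustration only):
# row key -> cells of the single "info" column. Each cell is a
# (timestamp, value) pair, mirroring the timestamp attached to every cell.
Cell = Tuple[datetime, str]
table: Dict[bytes, List[Cell]] = defaultdict(list)


def write_reading(machine_id: str, ts: datetime, info: str) -> None:
    """Append one reading to the row for this id and this (UTC) day."""
    table[make_row_key(machine_id, ts.date())].append((ts, info))


def readings_for_day(machine_id: str, day: date) -> List[Cell]:
    """Return every reading for one id and one day, newest first."""
    cells = table.get(make_row_key(machine_id, day), [])
    return sorted(cells, reverse=True)


utc = timezone.utc
write_reading("machine-42", datetime(2024, 3, 5, 8, 0, tzinfo=utc), "link up")
write_reading("machine-42", datetime(2024, 3, 5, 9, 30, tzinfo=utc), "link down")
write_reading("machine-42", datetime(2024, 3, 6, 7, 15, tzinfo=utc), "link up")

march_5 = readings_for_day("machine-42", date(2024, 3, 5))
for ts, info in march_5:
    print(ts.isoformat(), info)
```

In this sketch the most frequent query (all info for one id on one day) is a lookup of a single key, and no separate id or datetime column is stored: the id lives in the row key and the time of each reading lives in the cell timestamp.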