HBase Table Model


I'm learning how to use HBase. I need to store each trip made by several cars in the database, where a trip is a series of geolocated points (x, y). The data arrives in JSON format.

The problem is that the number of geolocated points varies from one document to the next, because each trip is different.

How can I store these data in HBase?

Do I have to change the number of columns for each row inserted?

  • Trip1 : x1,y1,x2,y2,x3,y3
  • Trip2 : x1,y1,x2,y2,x3,y3,x4,y4

Or do I need to keep only two columns, one for all the x values and one for all the y values?

  • Trip1 : (X,Y)
  • Trip2 : (X,Y)

Solution

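  • As I understand it, each trip is a time series of (x, y) coordinates. I would suggest the following schema design:

    Row key = shardKey + tripId + timestamp, and each row has an x column and a y column, so every point is its own row and the number of points per trip no longer matters. The shard key can be (tripId % number of regions), which spreads trips across regions and helps avoid hotspotting. Because all rows of one trip share the same shardKey + tripId prefix, the data for a trip can be retrieved with a single scan from one region.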
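
The row key layout above can be sketched in plain Python, without any HBase client, just to show how the bytes line up. Everything here is an assumption made for illustration: integer trip IDs, millisecond timestamps, 8 shards, a hypothetical column family named `p`, and the helper names `make_row_key`, `trip_scan_range` and `trip_to_rows`. Fixed-width big-endian encoding is used so that byte order matches numeric order, which is what keeps a trip's points sorted by time.

```python
import struct

NUM_SHARDS = 8  # assumption: matches the number of pre-split regions


def make_row_key(trip_id: int, timestamp_ms: int, num_shards: int = NUM_SHARDS) -> bytes:
    """Build shardKey + tripId + timestamp as 17 fixed-width big-endian bytes."""
    shard = trip_id % num_shards
    # B = 1-byte shard, Q = 8-byte unsigned trip id, Q = 8-byte unsigned timestamp
    return struct.pack(">BQQ", shard, trip_id, timestamp_ms)


def trip_scan_range(trip_id: int, num_shards: int = NUM_SHARDS):
    """Return (start_row, stop_row) covering every point of one trip.

    start_row is inclusive and stop_row is exclusive.
    """
    prefix = struct.pack(">BQ", trip_id % num_shards, trip_id)
    start_row = prefix
    # One byte past the largest possible timestamp for this trip.
    stop_row = prefix + b"\xff" * 8 + b"\x00"
    return start_row, stop_row


def trip_to_rows(trip_id: int, points):
    """Turn an iterable of (timestamp_ms, x, y) into (row_key, cells) pairs.

    'p:x' and 'p:y' use a hypothetical column family 'p'.
    """
    for timestamp_ms, x, y in points:
        cells = {b"p:x": struct.pack(">d", x), b"p:y": struct.pack(">d", y)}
        yield make_row_key(trip_id, timestamp_ms), cells
```

Each (row_key, cells) pair would then be written as one put with whichever HBase client you use, and the pair returned by `trip_scan_range` would serve as the start and stop rows of the scan. A trip with 3 points simply produces 3 rows and a trip with 4 points produces 4, so the column layout never changes.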