I am currently evaluating Bigtable to see if it is something we will use. We currently use Cloud SQL with Postgres 9.6, with the following schema:
id, sensor_id, time, value
Most of our queries ask for data for a set of sensors over a time range, something like this:
SELECT *
FROM readings
WHERE sensor_id IN (7297, 7298, 7299, 7300)
  AND time BETWEEN '2018-07-15 00:00:00' AND '2019-07-15 00:00:00'
ORDER BY time, sensor_id
Each sensor takes a reading roughly every 10 minutes, so that adds up to a fair bit of data. At last check we had about 2 billion records, and the count is growing considerably each day.
For Bigtable I am importing with a row key of
readings#timestamp#sensorId
so something like this: readings#20180715000000#7297
So far, so good.
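For context, the import writes rows along these lines. This is a simplified sketch rather than the actual import code; the instance name and the readings column family/qualifier are just placeholders:

const { Bigtable } = require('@google-cloud/bigtable');

// Placeholder names: the real instance, table, and column family will differ.
const bigtable = new Bigtable();
const table = bigtable.instance('my-instance').table('readings');

async function writeReading(timestamp, sensorId, value) {
  // Row key in the readings#timestamp#sensorId shape, e.g. readings#20180715000000#7297
  const key = `readings#${timestamp}#${sensorId}`;
  await table.insert({
    key,
    data: {
      // Assumes a 'readings' column family with a single 'value' column.
      readings: {
        value: String(value),
      },
    },
  });
}

// writeReading('20180715000000', 7297, 21.4)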
To query a range (using Node.js) I am doing this:
const fromDate = '20180715000000'
const toDate = '20190715000000'

const ranges = sensorIds.map(sensorId => {
  return {
    start: `readings#${fromDate}#${sensorId}`,
    end: `readings#${toDate}#${sensorId}`,
  };
});

const results = [];

await table.createReadStream({
    column: {
      cellLimit: 1,
    },
    ranges
  })
  .on('error', err => {
    console.log(err);
  })
  .on('data', row => {
    results.push({
      id: row.id,
      data: row.data
    });
  })
  .on('end', async () => {
    console.log(`${results.length} Rows`);
  });
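To make that concrete, with just the first two sensor IDs the map above produces ranges like these:

// Contents of `ranges` when sensorIds = [7297, 7298]
const ranges = [
  { start: 'readings#20180715000000#7297', end: 'readings#20190715000000#7297' },
  { start: 'readings#20180715000000#7298', end: 'readings#20190715000000#7298' },
];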
My understanding was that the results would be similar to the SQL query above, but it seems to return rows for all sensor IDs across the date range, not just the ones specified in the query.
My question is: why isn't this behaving like the SQL query above, and how do I restrict the results to the specified sensor IDs? Thanks in advance for your advice.
The problem is that you are setting up your ranges variable in the wrong way, and Bigtable is getting lost because of that. Try the following:
const fromDate = '20180715000000'
const toDate = '20190715000000'
const sensorId = sensorIds[0]

// Chain two filters: keep only the latest cell per column,
// and keep only cells whose value falls between the composed start/end bounds.
const filter = [
  {
    column: {
      cellLimit: 1,
    },
  },
  {
    value: {
      start: `readings#${fromDate}#${sensorId}`,
      end: `readings#${toDate}#${sensorId}`,
    },
  },
];

const results = [];

await table.createReadStream({
    filter
  })
  .on('error', err => {
    console.log(err);
  })
  .on('data', row => {
    results.push({
      id: row.id,
      data: row.data
    });
  })
  .on('end', () => {
    console.log(`${results.length} Rows`);
  });
**NOTE:** I am taking the first element of sensorIds, which I assume is a list of all the IDs, but you can select any of them. Also, this is all untested, but it should be a good starting point for you.
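If you end up needing all of the sensor IDs rather than a single one, one way to extend the same idea is to run one read per sensor and merge the results. Again, this is only an untested sketch; it reuses the filter shape from above and table.getRows to keep the loop simple:

// Untested sketch: one read per sensor ID, reusing the filter shape from above.
async function readAllSensors(table, sensorIds, fromDate, toDate) {
  const results = [];
  for (const sensorId of sensorIds) {
    const filter = [
      { column: { cellLimit: 1 } },
      {
        value: {
          start: `readings#${fromDate}#${sensorId}`,
          end: `readings#${toDate}#${sensorId}`,
        },
      },
    ];
    // getRows resolves to [rows], which avoids wiring up stream events for every sensor.
    const [rows] = await table.getRows({ filter });
    for (const row of rows) {
      results.push({ id: row.id, data: row.data });
    }
  }
  return results;
}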
You can find snippets showing the usage of the Node.js client for Bigtable in this GitHub repo.