Search code examples
javascripttime-seriesindexeddbdexie

Efficient storage and retrieval of time series data with indexedDb


I am looking at storing and retrieving time-series data with javascript in a web browser. I am expecting to have between 500 - 5000 floating point items to trend at a once a second interval. When storing each item would have a unique tag name, the same timestamp (Date.now()), and a floating point value.

When retrieving the data I am usually interested in getting an array of values for a single tag name where the timestamps fall within a specified range.

My issue is that I don't understand the indexedDb data model and API. Is it possible to store all my new values (generated once a second for all 1000 tags) with a single call?

I have experimented a little bit with Dexie as a wrapper around indexedDb and this was my test code:

async function start() {

  // Define database

  await Dexie.delete('trendDatabase');
  var db = new Dexie("trendDatabase");
  db.version(1).stores({
    trends: '++id,trendId,timestamp,value'
  });

  console.log ("Using Dexie v" + Dexie.semVer);

  // Query Database
  var result1 = await db.open();

  //add 1000 values for two trends

  var trendId1 = "FI-100";
  var trendId2 = "FI-200";
  var t1 = Date.now(); 
  for (var i=0; i<1000; i++) {
      var timestamp1 = t1 - (1000 + i) * 1000;
      var value1 = Math.sin(i/10)*8;
      var storeResult = await db.trends.add({trendId: trendId1, timestamp: timestamp1, value: value1});
      var value2 = Math.cos(i/100)*4;
      var storeResult = await db.trends.add({trendId: trendId2, timestamp: timestamp1, value: value1});
  }
  var t2 = Date.now();
  console.log("Took: ", t2 - t1);
  var t3 = Date.now();
  console.log("Took: ", t3 - t2);
  console.log(result3); 
  var result4 = db.delete();

}

The biggest problem is how slow the storage is. On my laptop it takes 52 seconds to store 2000 points of data (but only 11 ms to retrieve 1000 points). On my desktop with optane storage it takes about 2 seconds to store. In either case, this is far too slow. I need to be able to store once a second so I need a storage speed of < 1000 ms and ideally < 100 ms.

Are there better ways to structure time-series data in indexedDb?

One idea I had was that I could store the most recent 100 points of data for ALL trends at once and then write blocks of 100 points per individual trend (in a rotating order through every trend) to reduce the number of write calls by a factor of 100. This would also make so for retrieving recent data (in the last 100 seconds) I would end up getting values for all 1000 tags when I only wanted a handful of them, so I would have to do some work filtering out the non-relevant data. This approach is probably workable, but I wanted to query the community before I went through all the trouble to see if there are better approaches or any other projects/libraries for doing such a thing.


Solution

  • Good you found Table.bulkAdd() to use instead of Table.add() (saw your own response to the question)

    For the querying part, I understood you want to include both tag and timeRange in your query.

    I suppose the tag part is the same as trendId?

    If so, I would suggest you to migrate the schema to use a compound index of [trendId+timestamp] for more efficient queries.

    Would also recommend you to keep the db instance outside the start() function. Declare it in a module and export it.

    db.js

    // db.js
    export const db = new Dexie('trendDatabase');
    
    db.version(1).stores({
      trends: '++id,trendId,timestamp,value'
    });// (Keep version 1 if you or your users have it installed)
    
    // Migrate schema:
    db.version(2).stores({
      trends: '++id, [trendId+timestamp]'
    });
    

    query.js

    // query.js
    import { db } from './db';
    
    export function query(trendId, timeFrom, timeTo) {
      return db.trends
        .where('[trendId+timestamp]')
        .between([trendId, timeFrom], [trendId, timeTo])
        .toArray();
    }
    

    log.js

    import { db } from './db';
    
    export async function log(trends) {
      await db.trends.bulkAdd(trends);
    }
    

    As you can see, you only need to index the properties that you will use in a where-clause. That doesn't mean you can store other properties with each object.