Search code examples
aggregatetimescaledb

Does TimescaleDB work with historical data?


I can't seem to find this on their documentation at all. I use TimescaleDB to calculate a bunch of intermediate data products on my real time production server; hypertables, continuous aggregates etc. I've started to experiment with some other historical datasets on a dev server and I can't seem to get the cont.agg's to automatically update.

My testing methodology is:

  1. Spin up new docker container with fresh mounted volume.
  2. Run SQL to create empty tables, turn them into hypertables, and define cont.agg's from them.
  3. Insert new data into hypertable, run query on cont.agg views and discover they're empty
  4. run refresh_continuous_aggregate('my_cagg', first date, last date); to populate aggregate
  5. Insert more historical data
  6. Query aggregate to see that it hasn't updated

I have all the policies set and I'm seeing that they are getting run in the timescaledb jobs table but I'm not sure how to check what dates it thinks it should be looking for.

Does this only work on real time data? I can't have historical data that gets inserted in order?


Solution

  • PEBKAC

    The issue was that I was misunderstanding how the policies worked. I had my start time set to an interval of 1 month. I thought that it would use the data for 1 month before the high water mark but instead it was checking for new data 1 month before now.

    So, dropped policies, reinitialised the policies with start_time == Null and all is well