Search code examples
sqloracletime-seriesvertica

Duplicated and Missing Values in Vertica Time Series Analytics Query


currently I am facing a problem with genereting a list of dates at a specific start date in Vertica. Based on this article https://forum.vertica.com/discussion/240532/use-time-series-analytics-to-generate-a-list-of-dates-at-a-specific-start-date I used get the data I wanted. But recently, out of nowhere, my query generated dupblicate dates (months) and the last month (march) is missing completly.

My query:

SELECT
        LEFT(ADD_MONTHS((EXTRACT (YEAR FROM ts::DATE) || '-' || EXTRACT(MONTH FROM ts::DATE) ||
'-01')::DATE, 1)::varchar, 7) as validity_month
    FROM (
        SELECT ADD_MONTHS(CURRENT_DATE(), -36)::TIMESTAMP as tm -- take the last 36 months
        UNION ALL 
        SELECT CURRENT_DATE()::TIMESTAMP
    ) as t TIMESERIES ts as '1 MONTH' OVER (ORDER BY t.tm) -- build a timeseries for every month

The last 36 months until January 2023 just work fine and provide the data series I needed. Starting from January 2023 I get the following data:

validity_month
'2023-01'
'2023-01'
'2023-02'
'2023-04'
'2023-04'

March is missing and I get duplicated values. Does anyone have a clue what the problem is and how to fix it?

Thanks alot and best regrads!

Tried to change the query and expected I would generate a data series starting from a given month until today without missing and duplicated months


Solution

  • To solve the problem with month-long intervals, try using TIMESERIES just to get a series of integers, then CROSS JOIN with that int list, and use ADD_MONTHS() :

    WITH
      monthcount(monthcount) AS (SELECT 6) -- take the last 6 months
    , lim(dt) AS (
      SELECT TRUNC(ADD_MONTHS(CURRENT_DATE(), -monthcount),'MONTH') FROM monthcount
      UNION ALL 
      SELECT TRUNC(CURRENT_DATE(),'MONTH')
    )
    , ts(ts) AS (
      SELECT ts FROM lim
      TIMESERIES ts as '1 MONTH' OVER (ORDER BY dt::TIMESTAMP) 
    ) 
    , i(i) AS ( SELECT ROW_NUMBER() OVER(ORDER BY ts) FROM ts) 
    , minmonth(minmonth) AS (SELECT MIN(dt) FROM lim)
    SELECT
      ADD_MONTHS(minmonth,i) AS validity_month
    , i
    FROM i CROSS JOIN minmonth CROSS JOIN monthcount
    WHERE i <= monthcount
    ORDER BY 1
    ;
    
    validity_month i
    2022-11-01 1
    2022-12-01 2
    2023-01-01 3
    2023-02-01 4
    2023-03-01 5
    2023-04-01 6