Search code examples
sqlimpala

Converting monthly to daily data


I have monthly data that I would like to transform to daily data. The data looks like this. The extraction_dt is in date format.

isin extraction_date yield
001 2013-01-31 100
001 2013-02-28 110
001 2013-03-31 105
... ... ...
002 2013-01-31 200
... ... ...

And I would like to have something like this

isin extraction_dt yield
001 2013-01-01 100
001 2013-01-02 100
001 2013-01-03 100
.. ..... ...
001 2013-02-01 110
... ... ...

I tried the following code but it does not work. I get the error message AnalysisException: Could not resolve table reference: 'cte'. How would you convert monthly to daily data?

with cte as
(select isin, extraction_dt, yield
 from datashop
 union all
 select isin, extraction_dt, dateadd(d, 1, extraction_dt) AS date_dt, yield
 from cte
 where datediff(m,date_dt,dateadd(d, 1, date_dt))=0
)
select isin, date_dt,
       1.0*isin / count(*) over (partition by isin, date_dt) AS daily_yield
from cte
order by 1,2

Solution

  • I can suggest easy solution.

    1. generate a date series
    2. match it with your data so it gets repeated. So, here is the SQL you can use for Impala.
    select isin, extraction_dt, a.dt AS date_dt, yield
    from 
    datashop d,
     (
    select now() - INTERVAL (a.a + (10 * b.a) + (100 * c.a) + (1000 * d.a) ) DAY as dt
        from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
        cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
        cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
        cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as d
    ) a
    WHERE
    from_timestamp(a.dt,'yyyy/MM') =from_timestamp(d.extraction_dt,'yyyy/MM')
    order by 1,2,3
    

    the alias a is going to generate a series of dates.
    WHERE - this clause will restrict to the month of extraction_dt. and you will get all possible values for a month.
    ORDER BY - will show a nice output.