postgresql postgresql-9.3 window-functions

Using window functions to reduce amount of lines implementing data ranges

I want to reduce the amount of lines in a table, which has a lot of duplicated data. My first thought was to use some window function to define the date ranges to save in the table, so that whenever I need this info, the date ranges are just delimiters in the join condition. But then I noticed that some references were overlapping, so, I'm not sure which would be the best way to do it.

I'm using Postgres 9.3.

select  distinct    
    min(obs_date) over (partition by equipment, temperature) as beg_obs_date,
    max(obs_date) over (partition by equipment, temperature) as end_obs_date,
    equipment, 
    temperature
from    
(   select  generate_series('2016-05-01', '2016-05-08', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -15.20::real as temperature
    union all   
    select  generate_series('2016-05-09', '2016-05-15', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -20.00::real as temperature
    union all

    select  generate_series('2016-05-16', '2016-06-10', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -15.20::real as temperature
) sq

I get:

beg_obs_date    end_obs_date    equipment   temperature
2016-05-01      2016-06-10      FREEZER_1   -15,2
2016-05-09      2016-05-15      FREEZER_1   -20

What I want is:

beg_obs_date    end_obs_date    equipment   temperature
2016-05-01      2016-05-08      FREEZER_1   -15,2
2016-05-09      2016-05-15      FREEZER_1   -20
2016-05-16      2016-06-10      FREEZER_1   -15,2

Any idea?

Thank you!

Solution

Use row_number() to distinguish consecutive series. The data (slightly simplified) with packet added:

with the_data as (
    select  generate_series('2016-05-01', '2016-05-03', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -15.20::real as temperature
    union all   
    select  generate_series('2016-05-04', '2016-05-05', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -20.00::real as temperature
    union all
    select  generate_series('2016-05-06', '2016-05-08', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -15.20::real as temperature
    )
select 
    *, 
    row_number() over (partition by equipment, temperature order by obs_date)- row_number() over (order by obs_date) as packet
from the_data

  obs_date  | equipment | temperature | packet 
------------+-----------+-------------+--------
 2016-05-01 | FREEZER_1 |       -15.2 |      0
 2016-05-02 | FREEZER_1 |       -15.2 |      0
 2016-05-03 | FREEZER_1 |       -15.2 |      0
 2016-05-04 | FREEZER_1 |         -20 |     -3
 2016-05-05 | FREEZER_1 |         -20 |     -3
 2016-05-06 | FREEZER_1 |       -15.2 |     -2
 2016-05-07 | FREEZER_1 |       -15.2 |     -2
 2016-05-08 | FREEZER_1 |       -15.2 |     -2
(8 rows)

In max() and min() use packet instead of temperature:

with the_data as (
    select  generate_series('2016-05-01', '2016-05-03', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -15.20::real as temperature
    union all   
    select  generate_series('2016-05-04', '2016-05-05', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -20.00::real as temperature
    union all
    select  generate_series('2016-05-06', '2016-05-08', '1 day'::interval)::date as obs_date,
        'FREEZER_1'::varchar as equipment,
        -15.20::real as temperature
    )
select distinct    
    min(obs_date) over (partition by equipment, packet) as beg_obs_date,
    max(obs_date) over (partition by equipment, packet) as end_obs_date,
    equipment, 
    temperature
from (
    select 
        *, 
        row_number() over (partition by equipment, temperature order by obs_date)- row_number() over (order by obs_date) as packet
    from the_data
) s
order by 1;

 beg_obs_date | end_obs_date | equipment | temperature 
--------------+--------------+-----------+-------------
 2016-05-01   | 2016-05-03   | FREEZER_1 |       -15.2
 2016-05-04   | 2016-05-05   | FREEZER_1 |         -20
 2016-05-06   | 2016-05-08   | FREEZER_1 |       -15.2
(3 rows)