I need to get the values of a table by validity range. Currently I make a group by the columns of the table and I make a min and max about the year; but this gives me the wrong result for the case of the example. I would need to solve it with sentence sql or with pentaho
Sample data:
year validity Value1 Value2
2004 A B
2006 A C
2007 A B
2008 A B
SQL:
SELECT
min(anio), max(anio), value1, value2
FROM tabla
GROUP BY
value1, value2
Wrong result:
year min year max Value1 Value2
2004 2008 A B
2006 2006 A C
Expected result:
year min year max Value1 Value2
2004 2004 A B
2007 2008 A B
2006 2006 A C
Please help on this issue.
You can combine the LAG()
and SUM()
window functions to separate the rows into groups according to your logic. Once separated, computing the min and max values is trivial.
For example:
select
min(year_validity) as year_min,
max(year_validity) as year_max,
min(value1) as value1,
min(value2) as value2
from (
select
*,
sum(init) over(order by year_validity) as grp
from (
select *,
case when not ( (value1, value2) = (
lag(value1) over(order by year_validity),
lag(value2) over(order by year_validity)
) ) then 1 else 0 end as init
from tabla
) x
) y
group by grp
order by value1, value2
Result:
year_min year_max value1 value2
-------- -------- ------ ------
2004 2004 A B
2007 2008 A B
2006 2006 A C
For the record, the data script for this case is:
create table tabla (
year_validity int,
value1 varchar(10),
value2 varchar(10)
);
insert into tabla (year_validity, value1, value2) values (2004, 'A', 'B');
insert into tabla (year_validity, value1, value2) values (2006, 'A', 'C');
insert into tabla (year_validity, value1, value2) values (2007, 'A', 'B');
insert into tabla (year_validity, value1, value2) values (2008, 'A', 'B');