Suppose I have a table as follows:
Row Volume
1 10000
2 8000
3 0.01
4 0
5 5000
6 0
Now, if I were to use the LAG()
function in SQL Server, I believe the following table will result:
Row Volume LAG(Volume)
1 10000 NULL
2 8000 10000
3 0.01 8000
4 0 0.01
5 5000 0
6 0 5000
However, this is not exactly what I am looking for, so allow me to provide some background as to why I need the LAG(Volume)
column. The reason for it is because small values, such as 0.01
or 0
in my example, are likely to be data processing errors, so I want to override them with the last non-0.01
or non-0
value and then update the table by a JOIN
. I believe this requires me to first come up with the LAG(Volume)
column, but I would need the LAG()
function to ignore some values. In my example, they would be 0.01
and 0
. I guess, one could also say that I would like the lagged value to depend on what the lagged value actually is. In other words, for my example, I would like the following table to be produced instead:
Row Volume LAG(Volume)
1 10000 NULL
2 8000 10000
3 0.01 8000
4 0 8000
5 5000 8000
6 0 5000
Note that, for Row 6, the lag should be 5000 (and not 8000) as, although its volume is 0, its lag is neither 0 nor 0.01.
Can this be done?
Perhaps if anyone has alternatives as to how I can update these values to the last non-0.01
or non-0
value without having to create an intermediate lag column, please let me know as well.
Any suggestions will be greatly appreciated!
On earlier versions you can use a concatenation approach to simulate IGNORE NULLS
. 2016 DB Fiddle.
It uses MAX
with a window of ORDER BY [Row] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
Assuming Row
is a positive integer the value going into the MAX
is a concatenation of Row
and Volume
constructed such that later Row
numbers will be the MAX
- but these are nulled out in the cases that you want to ignore and MAX
automatically disregards those.
The Volume
is then extracted out with SUBSTRING
and CAST
back to decimal
- if you are not in fact using decimal(12,2)
then change this to what you are using.
SELECT
*,
CAST(SUBSTRING(MAX(RIGHT(CONCAT('0000000000', [Row]),10) + CASE WHEN Volume > 0.01 THEN CAST(Volume AS VARCHAR(20)) END) OVER (ORDER BY [Row] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 11, 20) AS decimal(12,2))
FROM YourTable