Search code examples
expressioncalculated-columnsspotfire

matching for group (Expressions) in same column in Spotfire


Regarding the previous question which I have posted: calculation the difference for same column for the specific rows in Spotfire

I have a new problem for it, below is the sample:

enter image description here

The new function that i want to realize is,

  • searching for the next nearest TYPE=0 for TYPE=1
  • if the TYPE=1 has nearest TYPE=0, then mark it as 'T' in a new calculated column, otherwise as NULL

Data Rules:

  • Status column contains{1,2} in order, the default value of the null space is same as the last nearest Status value above it.
  • Type column contains only 0 and 1 randomly

output should be like this:

enter image description here

the solution what i have tried:

    If(([type]=1) and (first([type]) OVER (intersect(previous([type]),AllNext([status])))=0),"T",Null)

it looks fine, but the problem is in each status group, for example the last TYPE=1(the 5th row) in the first status=1 group, it has not next nearest TYPE=0, so the judgement would be Null. But based on the code, it is T! :(

any suggestion and idea for it? thanks a lot'! PS: some details:

  • the first Type Value of the status is NULL
  • Other null space in the status column can be filled as below, if it is helpful for the expression :):

enter image description here


Solution

  • Ok, this one really tested my limits and I'm curious if it will scale. It works for your data, given where you have NULL. It took a few hours to figure out.

    1. Insert a calculated column RowId() and name it RowNum
    2. Insert a calculated column RankReal([status],"ties.method=first") and name it Rank
    3. Insert a calculated column If((first([status]) over (Previous([RowNum])) - First([status]) over ([RowNum]))=0,[Rank] - Max([RowNum]) OVER (Intersect([status],AllPrevious([RowNum])))) and name it GroupOfTypes
    4. Inert a calculated column If([type]>Min([type]) over (Intersect([GroupOfTypes],AllNext([RowNum]))),"T") and name it Marking. This is the row you really care about.

    Results

    enter image description here

    EXPLANATION

    RankReal([status],"ties.method=first")

    This is done to essentially create a pseudo rownumber based on a segment of statuses. You'll see it ranks based on the status, sequentially. It's the first step in my method to group your data.

    If((first([status]) over (Previous([RowNum])) - First([status]) over ([RowNum]))=0,[Rank] - Max([RowNum]) OVER (Intersect([status],AllPrevious([RowNum]))))

    The first part of this, (first([status]) over (Previous([RowNum])) - First([status]) over ([RowNum]))=0 is evaluating if the previous row's [status] column is the same as the current row. If it is, it returns a boolean TRUE value. If it is the same as it's previous row, we know that it belongs in the same block / grouping, so we do some math to label this column with the same value for the entire block. That is [Rank] - Max([RowNum]) OVER (Intersect([status],AllPrevious([RowNum]))). Therefore, each row within our grouping will equate to the same value.

    If([type]>Min([type]) over (Intersect([GroupOfTypes],AllNext([RowNum]))),"T")

    Lastly we evaluate if the [type] is larger than the mnimal [type] over all the next rows, based on the rownumber. This limits the data we focus on to those where [type] = 1 without actually filtering the rows, while only looking forward in the dataset. If this is true we flag it with T.