Search code examples
rduplicatesspotfire

How to find the consecutive values in a column based on another group column


I am trying to find consecutive duplicate records based on each group and flag it as consecutive.

Below is an example i need to check the Values for one group and check if it is repeated in the next group, please provide your suggestion in r i am beginner.

enter image description here

i have tried with ranking in spotfire but not able to achieve the solution.


Solution

  • This worked for my test cases:

    case  
    when ((Max(DenseRank([Group])) over ([Values]) - Min(DenseRank([Group])) over ([Values]) + 1)=Count(DenseRank([Group])) over ([Values])) and (Count(DenseRank([Group])) over ([Values])>1) then 'Consecutive' 
    when Count(DenseRank([Group])) over ([Values])=1 then Null
    else 'Repeated'
    end
    

    or broken down with intermediate columns:

    case  
    when ([dRank]=[Count1]) and ([Count1]>1) then 'Consecutive' 
    when [Count1]=1 then Null
    else 'Repeated'
    end
    

    where:

    [Rank2]: Max(DenseRank([Group])) over ([Values]
    
    [Rank1]: Min(DenseRank([Group])) over ([Values]
    
    [dRank]: [Rank2] - [Rank1] + 1
    
    [Count1]: Count(DenseRank([Group])) over ([Values]
    

    The idea being that there are three possible cases:

    a) there is no gap between the max group and min Group for each value, and there are more than one associated Group

    b) there is only one associated Group

    c) everything else.

    I am still assuming that Groups that appear consecutively are indeed consecutive, regardless of the group name.