
Identify if a value matches the id in a group it was previously matched with


I have a data frame with data grouped by site and an id. I would like to create a new column with a logical value indicating whether, within each site, the angle for an id matches the angle recorded for that id in a previous row.

other.dat <- c(1:12)
site <- c(rep(1,6), rep(2,6))
id <- c(1.1,1.2,1.3,1.1,1.2,1.3,2.1,2.2,2.3,2.1,2.2,2.3)
angle <- c(65,250,150,65,250,150,0,240,150,345,205,100)
data.frame(other.dat,site,id,angle)

   other.dat site  id angle
1          1    1 1.1    65
2          2    1 1.2   250
3          3    1 1.3   150
4          4    1 1.1    65
5          5    1 1.2   250
6          6    1 1.3   150
7          7    2 2.1     0
8          8    2 2.2   240
9          9    2 2.3   150
10        10    2 2.1   345
11        11    2 2.2   205
12        12    2 2.3   100

Unlike the example given in a previous question (Identify if a value is repeated or not by groups in R), angle does not need to be unique to a site for the logical variable to be TRUE, and the first occurrence of each id should always be TRUE. For example:

other.dat <- c(1:12)
site <- c(rep(1,6), rep(2,6))
id <- c(1.1,1.2,1.3,1.1,1.2,1.3,2.1,2.2,2.3,2.1,2.2,2.3)
angle <- c(65,250,150,65,250,150,0,240,150,345,205,100)
same.angle <- c(T,T,T,T,T,T,T,T,T,F,F,F)
data.frame(other.dat,site,id,angle, same.angle)

   other.dat site  id angle same.angle
1          1    1 1.1    65       TRUE
2          2    1 1.2   250       TRUE
3          3    1 1.3   150       TRUE
4          4    1 1.1    65       TRUE
5          5    1 1.2   250       TRUE
6          6    1 1.3   150       TRUE
7          7    2 2.1     0       TRUE
8          8    2 2.2   240       TRUE
9          9    2 2.3   150       TRUE
10        10    2 2.1   345      FALSE
11        11    2 2.2   205      FALSE
12        12    2 2.3   100      FALSE

I have tried using df <- within(df, same.angle <- as.logical(angle, id, FUN = function(x) length(unique(x))==1))) but, for a reason I don't yet understand, only row 7 is returned as FALSE.

Thanks in advance for your help!


Solution

  • Try this. Group by site and id, then use lag() to get the previous angle within each group and compare it with the current angle. The first row of each group has no previous value (lag() returns NA), so it is set to TRUE:

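    library(dplyr)

    # Compare each angle with the previous angle for the same site and id
    new <- df %>%
      group_by(site, id) %>%
      mutate(Var = lag(angle),                            # previous angle in the group, NA for the first row
             Same.angle = is.na(Var) | angle == Var) %>%  # first occurrence counts as TRUE
      select(-Var)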
    

    Output:

    # A tibble: 12 x 5
    # Groups:   site, id [6]
       other.dat  site    id angle Same.angle
           <int> <dbl> <dbl> <dbl> <lgl>     
     1         1     1   1.1    65 TRUE      
     2         2     1   1.2   250 TRUE      
     3         3     1   1.3   150 TRUE      
     4         4     1   1.1    65 TRUE      
     5         5     1   1.2   250 TRUE      
     6         6     1   1.3   150 TRUE      
     7         7     2   2.1     0 TRUE      
     8         8     2   2.2   240 TRUE      
     9         9     2   2.3   150 TRUE      
    10        10     2   2.1   345 FALSE     
    11        11     2   2.2   205 FALSE     
    12        12     2   2.3   100 FALSE
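
  • If you would rather stay in base R, the same comparison can probably be done with ave(). This is only a sketch under a few assumptions: the data frame is called df as in the question, prev_value is a hypothetical helper written here for illustration (it is not part of any package), and "matches" means equal to the angle in the previous row for the same site and id, as in the lag() approach above.

```r
# Rebuild the example data from the question.
df <- data.frame(
  other.dat = 1:12,
  site  = c(rep(1, 6), rep(2, 6)),
  id    = c(1.1, 1.2, 1.3, 1.1, 1.2, 1.3, 2.1, 2.2, 2.3, 2.1, 2.2, 2.3),
  angle = c(65, 250, 150, 65, 250, 150, 0, 240, 150, 345, 205, 100)
)

# Hypothetical helper: shift a vector down by one position, so each element
# holds the previous value and the first element is NA.
prev_value <- function(x) {
  if (length(x) == 0) return(x)  # ave() may pass empty site/id combinations
  c(NA, x[-length(x)])
}

# ave() applies the helper within each site/id combination and returns a
# vector aligned with the original row order.
prev.angle <- ave(df$angle, df$site, df$id, FUN = prev_value)

# TRUE for the first occurrence (no previous angle) or when the angle is unchanged.
df$same.angle <- is.na(prev.angle) | df$angle == prev.angle
```

    On the example data this should give TRUE for rows 1 to 9 and FALSE for rows 10 to 12. As for the attempt in the question, the result you saw is consistent with as.logical() simply converting angle itself: a numeric 0 becomes FALSE and any other number becomes TRUE, and angle is 0 only in row 7.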