Search code examples
rdplyrtidyversedata-manipulation

Check if the value exists in any other columns with Tidyverse


Sample data

df <- diag(1:8) %>% 
  as_tibble() %>%  
  mutate(A = c(10, 5, 11, 4, 6, 9, 65, 8))

# A tibble: 8 x 9
     V1    V2    V3    V4    V5    V6    V7    V8     A
  <int> <int> <int> <int> <int> <int> <int> <int> <dbl>
1     1     0     0     0     0     0     0     0    10
2     0     2     0     0     0     0     0     0     5
3     0     0     3     0     0     0     0     0    11
4     0     0     0     4     0     0     0     0     4
5     0     0     0     0     5     0     0     0     6
6     0     0     0     0     0     6     0     0     9
7     0     0     0     0     0     0     7     0    65
8     0     0     0     0     0     0     0     8     8

The desired output would be

# A tibble: 8 x 10
     V1    V2    V3    V4    V5    V6    V7    V8     A result
  <int> <int> <int> <int> <int> <int> <int> <int> <dbl> <lgl> 
1     1     0     0     0     0     0     0     0    10 FALSE 
2     0     2     0     0     0     0     0     0     5 TRUE  
3     0     0     3     0     0     0     0     0    11 FALSE 
4     0     0     0     4     0     0     0     0     4 TRUE  
5     0     0     0     0     5     0     0     0     6 TRUE  
6     0     0     0     0     0     6     0     0     9 FALSE 
7     0     0     0     0     0     0     7     0    65 FALSE 
8     0     0     0     0     0     0     0     8     8 TRUE

AND/OR ... A takes the value that is non-zero value in any other columns, such that the output would be:

# A tibble: 8 x 9
     V1    V2    V3    V4    V5    V6    V7    V8     A
  <int> <int> <int> <int> <int> <int> <int> <int> <int>
1     1     0     0     0     0     0     0     0     1
2     0     2     0     0     0     0     0     0     2
3     0     0     3     0     0     0     0     0     3
4     0     0     0     4     0     0     0     0     4
5     0     0     0     0     5     0     0     0     5
6     0     0     0     0     0     6     0     0     6
7     0     0     0     0     0     0     7     0     7
8     0     0     0     0     0     0     0     8     8

How can this be achieved with tidyverse?


Solution

  • You may try

    library(dplyr)
    df %>%
      mutate(result = rowSums(across(everything() & - "A", ~A %in% .x)) > 0)
    
         V1    V2    V3    V4    V5    V6    V7    V8     A result
      <int> <int> <int> <int> <int> <int> <int> <int> <dbl> <lgl> 
    1     1     0     0     0     0     0     0     0    10 FALSE 
    2     0     2     0     0     0     0     0     0     5 TRUE  
    3     0     0     3     0     0     0     0     0    11 FALSE 
    4     0     0     0     4     0     0     0     0     4 TRUE  
    5     0     0     0     0     5     0     0     0     6 TRUE  
    6     0     0     0     0     0     6     0     0     9 FALSE 
    7     0     0     0     0     0     0     7     0    65 FALSE 
    8     0     0     0     0     0     0     0     8     8 TRUE  
    

    If starts with

         V1    V2    V3    V4    V5    V6    V7    V8     A
      <int> <int> <int> <int> <int> <int> <int> <int> <dbl>
    1     1     0     0     0     0     0     0     0    10
    2     0     2     0     0     0     0     0     0     5
    3     0     0     3     0     0     0     0     0    11
    4     0     0     0     4     0     0     0     0     4
    5     0     0     0     0     5     0     0     0     6
    6     0     0     0     0     0     6     0     0     9
    7     0     0     0     0     0     0     7     0    65
    8     0     0     0     0     0     0     0     8     8
    

    then

    df %>%
      mutate(A = rowSums(.) - A)
    

    If starts with

         V1    V2    V3    V4    V5    V6    V7    V8
      <int> <int> <int> <int> <int> <int> <int> <int>
    1     1     0     0     0     0     0     0     0
    2     0     2     0     0     0     0     0     0
    3     0     0     3     0     0     0     0     0
    4     0     0     0     4     0     0     0     0
    5     0     0     0     0     5     0     0     0
    6     0     0     0     0     0     6     0     0
    7     0     0     0     0     0     0     7     0
    8     0     0     0     0     0     0     0     8
    

    then

    df %>%
      mutate(A = rowSums(.))