Search code examples
rdataframedata-manipulation

Dataframe new column based on other columns and groupby in r


I have a question regarding simple data frame manipulation in R. I have the following df table (with more rows of course):

A_ID B_ID C_ID Value
1 1 1 2
1 2 1 1
2 3 3 0
2 4 3 3

And I would like to have the following table:

A_ID B_ID C_ID Value Value_Equal Value_NotEqual
1 1 1 2 2 1
1 2 1 1 2 1
2 3 3 0 0 3
2 4 3 3 0 3

So its like a group_by for A_ID, I want to check for each unique(A_ID) if B_ID=C_ID. If this is true, I want to have the Value for Value_Equal(Equal here means B_ID=C_ID), but not only for the row, but rather for the the other row with the same A_ID.

If its False, I want to have the Value for the column "Value_NotEqual", and again not just for the row, but rather for the other row with the same A_ID..

I hope it is clear what I mean. If you have any questions regarding my problem task, just ask. Thanks in advance!


Solution

  • Assuming there are only at most one case, grouped by 'A_ID', extract the 'Value' based on the logical expression (B_ID == C_ID or B_ID != C_ID) to create the 'Value_Equal/Value_NotEqual columns)

    library(dplyr)
    df1 %>% 
      group_by(A_ID) %>%
      mutate(Value_Equal = Value[B_ID == C_ID][1],
          Value_NotEqual = Value[B_ID != C_ID][1]) %>%
      ungroup
    

    -output

    # A tibble: 4 × 6
       A_ID  B_ID  C_ID Value Value_Equal Value_NotEqual
      <int> <int> <int> <int>       <int>          <int>
    1     1     1     1     2           2              1
    2     1     2     1     1           2              1
    3     2     3     3     0           0              3
    4     2     4     3     3           0              3
    

    data

    df1 <- structure(list(A_ID = c(1L, 1L, 2L, 2L), B_ID = 1:4, C_ID = c(1L, 
    1L, 3L, 3L), Value = c(2L, 1L, 0L, 3L)), class = "data.frame", 
    row.names = c(NA, 
    -4L))