
Creating a new column using two columns from the data frame in R


Below is an example of the data frame:

| A  | B  |
| 0  | NA |
| 1  | NA |
| 1  | 0  |
| 1  | 0  |
| 1  | 1  |
| 0  | NA |
| 1  | NA |
| 1  | 0  |
| 1  | 0  |
| 1  | 1  |

Using the values in columns A and B, how can I create a new column that follows these conditions:

if A = 0 and B is NA, put 0 in the new column
if A = 1 and B = 0, put 0 in the new column
if A = 1 and B = 1, put 1 in the new column
if A = 1 and B is NA, put NA in the new column

The desired result would look like this:

| A  | B  | new_col |
| 0  | NA | 0       |
| 1  | NA | NA      |
| 1  | 0  | 0       |
| 1  | 0  | 0       |
| 1  | 1  | 1       |
| 0  | NA | 0       |
| 1  | NA | NA      |
| 1  | 0  | 0       |
| 1  | 0  | 0       |
| 1  | 1  | 1       |

I could not find an appropriate solution for this yet. If something is unclear and needs more explanation, please comment. I appreciate any help you can provide.


Solution

  • From your example, the logic can probably be summarized as follows:

    # if `A == 0`, output `0`; otherwise, return the value of `B`
    > transform(df, new_col = ifelse(A == 0, 0, B))
       A  B new_col
    1  0 NA       0
    2  1 NA      NA
    3  1  0       0
    4  1  0       0
    5  1  1       1
    6  0 NA       0
    7  1 NA      NA
    8  1  0       0
    9  1  0       0
    10 1  1       1
    

    Data

    > dput(df)
    structure(list(A = c(0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L),
        B = c(NA, NA, 0L, 0L, 1L, NA, NA, 0L, 0L, 1L)),
        class = "data.frame", row.names = c(NA, -10L))
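
Since the question is tagged dplyr, the same logic could also be sketched with `mutate()`. This is only an illustration and assumes the dplyr package is installed; the names `out` and `new_col_explicit` are made up for the example. `if_else()` mirrors the compact `ifelse()` answer above, while `case_when()` spells out the four conditions from the question one by one.

```r
library(dplyr)

# Same data as the dput() output above
df <- data.frame(
  A = c(0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L),
  B = c(NA, NA, 0L, 0L, 1L, NA, NA, 0L, 0L, 1L)
)

out <- df %>%
  mutate(
    # Compact form: 0 when A is 0, otherwise whatever B holds (including NA)
    new_col = if_else(A == 0, 0L, B),
    # Explicit form: one branch per condition listed in the question
    # (`out` and `new_col_explicit` are illustrative names)
    new_col_explicit = case_when(
      A == 0 & is.na(B) ~ 0L,
      A == 1 & B == 0   ~ 0L,
      A == 1 & B == 1   ~ 1L,
      A == 1 & is.na(B) ~ NA_integer_
    )
  )

print(out)
```

Both new columns should agree with the `new_col` produced by `transform()`. The integer literal `0L` is used because `B` is an integer column and older dplyr versions require the `true` and `false` arguments of `if_else()` to have the same type. Rows where a `case_when()` condition evaluates to `NA` (for example `B == 0` when `B` is missing) are treated as not matching, so they fall through to the later branches.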