
How to conditionally subtract one variable from another and store the result in a new variable


I want to run code that checks the value of variable x. If x meets one condition, it should subtract z from y and store the result in a new variable q. If x instead meets a second condition, it should subtract y from z. If neither condition is met, it should return NA.

This is my first time coding in R, so I might just be overlooking the obvious; if that is the case, I am sorry. I have made a genuine effort to solve it myself: I have tried googling it and have worked on it for a few days, and I'm just stuck.

Ideally, I want it to look like this:

df <- data.frame (x  = c(1, 5, 3, 6, 2, 4, 7),
                  y = c(23, 34, 29, 26, 38, 54, 89),
                  z = c(5, 45, 12, 56, 28, 67, 90))
# Pseudocode for the new column q:
# q <- if x >= 1 & x <= 3 then y - z
#      if x >= 5 & x <= 7 then z - y
#      if x matches neither, then assign NA or 0


    x   y   z   q
1   1  23   5  18
2   5  34  45  11
3   3  29  12  17
4   6  26  56  30
5   2  38  28  10
6   4  54  67  NA
7   7  89  90   1

I originally tried the mutate() function:

library(dplyr)
df_1 <- df %>%
mutate(q <- df ( 
   df$x >= 1 & df$x <= 3 ~ y - z,
   df$x >= 5 & df$x <= 7 ~ x - y,
   TRUE ~ "NA"
))

but got this error

dplyr:::mutate.data.frame(...)
dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env())
mask$eval_all_mutate(quo)

I got mutate() to work with character values, but I want the new variable to hold a value calculated from two other variables.

I tried case_when

    df1 <- df %>% 
      mutate(
        q = case_when( 
         x >= 1 & x <= 3, y - z ,
          x >= 5 & x <= 7, z - y,
          TRUE   ~ "NA"
        )
      )

but got this error:
dplyr::case_when(...)

I thought maybe if/else could work:

    q <- c (if (df$x >= 1 & df$x <= 3) {
      z - y
    } else if (df$x >= 5 & df$x <= 7) {
      y - z
    } else {"na"})

Then I get: Error in if (anes_df$prty_id >= 1 & anes_df$prty_id <= 3) { : the condition has length > 1

I thought maybe it would be possible to create a minus function to be used in the ifelse code, but I'm unable to find a solution for that.

Solution

  • You can chain together a couple of ifelse() calls:

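    df <- data.frame(x = c(1, 5, 3, 6, 2, 4, 7),
                     y = c(23, 34, 29, 26, 38, 54, 89),
                     z = c(5, 45, 12, 56, 28, 67, 90))

    # ifelse() is vectorised: it tests every row and picks the matching value.
    # Signs follow the expected output in the question (y - z for x in 1 to 3).
    df$q <- ifelse(df$x >= 1 & df$x <= 3, df$y - df$z,
                   ifelse(df$x >= 5 & df$x <= 7, df$z - df$y, NA))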
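
It may help to see why the plain if/else attempt stops with "the condition has length > 1" while ifelse() does not. The comparison on a column returns one logical value per row, and if () only accepts a single value. The following is a minimal sketch using the question's data and the signs from its expected table; the names in_low, in_high and expected_q are made up for illustration.

```r
df <- data.frame(x = c(1, 5, 3, 6, 2, 4, 7),
                 y = c(23, 34, 29, 26, 38, 54, 89),
                 z = c(5, 45, 12, 56, 28, 67, 90))

# Each comparison yields one TRUE/FALSE per row, which if () cannot handle
in_low  <- df$x >= 1 & df$x <= 3   # hypothetical helper name
in_high <- df$x >= 5 & df$x <= 7   # hypothetical helper name
length(in_low)  # 7

# ifelse() works element by element, so it accepts the whole logical vector
df$q <- ifelse(in_low, df$y - df$z,
               ifelse(in_high, df$z - df$y, NA))

# Compare against the q column shown in the question
expected_q <- c(18, 11, 17, 30, 10, NA, 1)
all.equal(df$q, expected_q)  # TRUE
```

Storing the two conditions in named vectors is only a readability choice here; passing the comparisons to ifelse() directly behaves the same way.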
    

    case_when() would also work, but getting the final TRUE (fallback) branch right is not straightforward.
    From the help:

    All RHS values need to be of the same type. Inconsistent types will throw an error.
    This applies also to NA values used in RHS: NA is logical, use typed values like NA_real_, NA_complex_, NA_character_, NA_integer_ as appropriate.

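    library(dplyr)

    df %>% mutate(
       q = case_when(
          # each branch is written as condition ~ value, not condition, value
          x >= 1 & x <= 3 ~ y - z,
          x >= 5 & x <= 7 ~ z - y,
          # typed NA so the fallback has the same type as the numeric branches
          TRUE ~ NA_real_
       )
    )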
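
The case_when() attempt in the question likely fails for two reasons: its branches use commas where case_when() expects condition ~ value formulas, and its fallback is the string "NA", which is a character value and not a missing number. Below is a small sketch of a working variant, assuming dplyr is installed. It uses dplyr's between() as shorthand for the range checks and the signs from the question's expected table; df_q is a made-up name for the result.

```r
library(dplyr)

df <- data.frame(x = c(1, 5, 3, 6, 2, 4, 7),
                 y = c(23, 34, 29, 26, 38, 54, 89),
                 z = c(5, 45, 12, 56, 28, 67, 90))

# The three "NA" candidates have different types
class(NA)        # "logical"
class(NA_real_)  # "numeric"
class("NA")      # "character", an ordinary string

# between(x, 1, 3) is equivalent to x >= 1 & x <= 3
df_q <- df %>%
  mutate(
    q = case_when(
      between(x, 1, 3) ~ y - z,
      between(x, 5, 7) ~ z - y,
      TRUE             ~ NA_real_  # numeric NA matches the other branches
    )
  )

df_q$q
```

Rows where x falls outside both ranges (here only x = 4) end up with NA in q.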