Assistance in creating formula to replace variables


I have a dataset like this:

db <- structure(list(group = c(1, 1, 1, 2, 2, 2, 2), female = c(1, 
0, 1, 1, 1, 0, 1)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-7L))

Here female = 1 means the patient is female and female = 0 means the patient is male.

Then I have this function, where x is the frequency of females in group 1 and y is the frequency of females in group 2, both entered as percentages that I calculated manually:

formula <- function(x,y){
  x <- x/100 # convert the percentage to a proportion
  y <- y/100
  z <- x*(1-x)
  t <- y*(1-y)
  k <- sum(z,t)
  l <- k/2

  return((x-y)/sqrt(l))
}
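
Written as an equation, the function above computes the following, where p_1 = x/100 and p_2 = y/100 are the two frequencies as proportions (the symbols p_1, p_2 and d are labels introduced here for illustration only):

$$
d = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1(1 - p_1) + p_2(1 - p_2)}{2}}}
$$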

Instead of entering the frequency of females in group 1 (x) and in group 2 (y) by hand, I would like to change the function so that I can call something like formula(db$female, db$group) and have R calculate the frequencies of females itself and use them in the formula.


Solution

  • Based on your description, I have slightly modified the function. It should work as described, provided the groups are coded as 1 and 2. Because the proportions are now computed from the data, they are already on the 0-1 scale, so the division by 100 is dropped.

    formula <- function(input1, input2){
      
      # proportion of 1s (females) within each group
      x <- mean(input1[input2 == 1])
      y <- mean(input1[input2 == 2])
      
      z <- x*(1-x)
      t <- y*(1-y)
      k <- sum(z,t)
      l <- k/2
      
      return((x-y)/sqrt(l))
    }
    

    Because x and y are reused inside the function, I renamed the arguments to input1 and input2 to avoid confusion.
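
As a quick check, here is a minimal, self-contained sketch that runs the same calculation on the data from the question. The helper name prop_diff and its argument names are hypothetical, chosen so the example does not mask R's built-in formula(). It assumes the indicator is coded 0/1 and the groups are coded 1 and 2, so that mean() of the indicator is already a proportion and no division by 100 is needed.

```r
# Example data from the question (a plain data.frame instead of a tibble)
db <- data.frame(
  group  = c(1, 1, 1, 2, 2, 2, 2),
  female = c(1, 0, 1, 1, 1, 0, 1)
)

# Hypothetical helper: the same calculation, with mean() giving the proportions
prop_diff <- function(indicator, group) {
  p1 <- mean(indicator[group == 1])  # proportion of 1s in group 1
  p2 <- mean(indicator[group == 2])  # proportion of 1s in group 2
  pooled <- (p1 * (1 - p1) + p2 * (1 - p2)) / 2
  (p1 - p2) / sqrt(pooled)
}

result <- prop_diff(db$female, db$group)
print(result)  # about -0.184, from p1 = 2/3 and p2 = 3/4
```

If the data contain missing values, mean() would return NA; passing na.rm = TRUE to mean() is one way to handle that, depending on how missing values should be treated.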