
Apply a function to dataframe - arguments must have the same length


I have a large dataset x with two parameters, q and ssc, grouped by the value of he. Each he value identifies one loop, and there are many groups (≈100).

x <- data.frame(q = c(1.62, 1.82,2.09, 2.48, 2.19, 1.87, 1.67,1.44,1.8,2.52,2.27,1.83,1.68,1.54),
                ssc = c(238, 388, 721, 744, 307, 246, 222,216,228,1169,5150,2217,641,304),
                he = c(1,1,1,1,1,1,1,2,2,2,2,2,2,2))

plot(ssc~q, type = "o", group = he, data = x)

I want to apply my own functions, such as foo1, to every group:

foo1 <- function(i) {
M <- lm(log(ssc) ~ I(log(q)), data = x)
a <- exp(coef(M)[1])
b <- coef(M)[2]
res <- x$ssc - a*x$q^b
r <- mean(res[1:which.max(x$q)])
f <- mean(res[c((which.max(x$q)+1):length(x$q))])
HI <- r-f
return(HI)
}

In the end I want a matrix with two columns: he and the result of foo1. I tried tapply but cannot figure out how to make it use two input columns (q and ssc):

  tapply(X = list(x$q, x$ssc), x$he, foo1)

>Error in tapply(X = list(x$q, x$ssc), x$he, foo1) : 
>arguments must have the same length
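
The message follows from how tapply checks its inputs: X is expected to hold one element per entry of the grouping vector, while a list of two columns has length 2 and x$he has length 14. A small check on the same data illustrates the mismatch (the names X, err and max_q are introduced here only for illustration):

```r
x <- data.frame(q = c(1.62, 1.82, 2.09, 2.48, 2.19, 1.87, 1.67, 1.44, 1.8, 2.52, 2.27, 1.83, 1.68, 1.54),
                ssc = c(238, 388, 721, 744, 307, 246, 222, 216, 228, 1169, 5150, 2217, 641, 304),
                he = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2))

X <- list(x$q, x$ssc)
length(X)      # number of list elements (one per column)
length(x$he)   # number of rows

# The original call pattern fails because the two lengths differ
err <- try(tapply(X, x$he, length), silent = TRUE)
inherits(err, "try-error")

# tapply is fine with a single column, which has one element per row
max_q <- tapply(x$q, x$he, max)
```

So tapply on its own can only hand one vector to the function; passing both q and ssc requires splitting the rows of the data frame rather than a single column.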

Solution

  • I made two changes to your function. First, you pass i but use x inside the function, so every call fits the model on the whole dataset; I changed x to i. Second, instead of returning a single number, I add the result as a new column HI to the group's data frame and return that data frame.

    foo1 <- function(i) {
        M <- lm(log(ssc) ~ I(log(q)), data = i)
        a <- exp(coef(M)[1])
        b <- coef(M)[2]
        res <- i$ssc - a*i$q^b
        r <- mean(res[1:which.max(i$q)])
        f <- mean(res[c((which.max(i$q)+1):length(i$q))])
        i$HI <- r-f
        return(i)
    }
    

    Then use group_by(...) %>% do(...) from dplyr to apply the function to each group:

    library(dplyr)

    x %>%
      group_by(he) %>%
      do(foo1(.)) %>%
      ungroup()
    
    # A tibble: 14 x 4
    # Groups: he [2]
    #        q   ssc    he     HI
    #    <dbl> <dbl> <dbl>  <dbl>
    #  1  1.62  238.    1.   207.
    #  2  1.82  388.    1.   207.
    #  3  2.09  721.    1.   207.
    #  4  2.48  744.    1.   207.
    #  5  2.19  307.    1.   207.
    #  6  1.87  246.    1.   207.
    #  7  1.67  222.    1.   207.
    #  8  1.44  216.    2. -1961.
    #  9  1.80  228.    2. -1961.
    # 10  2.52 1169.    2. -1961.
    # 11  2.27 5150.    2. -1961.
    # 12  1.83 2217.    2. -1961.
    # 13  1.68  641.    2. -1961.
    # 14  1.54  304.    2. -1961.
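
    If one row per group is preferred (the question asks for a table of he and the foo1 result), a base R sketch along the following lines may work. It is not part of the original answer: hi_index is a hypothetical variant of foo1 that returns a single number, and it assumes the peak of q is not the last observation in a group (otherwise the falling-limb mean is NaN).

```r
x <- data.frame(q = c(1.62, 1.82, 2.09, 2.48, 2.19, 1.87, 1.67, 1.44, 1.8, 2.52, 2.27, 1.83, 1.68, 1.54),
                ssc = c(238, 388, 721, 744, 307, 246, 222, 216, 228, 1169, 5150, 2217, 641, 304),
                he = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2))

# Hypothetical variant of foo1: takes one group's data frame, returns one number
hi_index <- function(d) {
  M <- lm(log(ssc) ~ log(q), data = d)   # power-law fit on the log scale
  a <- exp(coef(M)[1])
  b <- coef(M)[2]
  res <- d$ssc - a * d$q^b               # residuals on the original scale
  peak <- which.max(d$q)
  r <- mean(res[seq_len(peak)])          # rows up to and including the peak
  f <- mean(res[-seq_len(peak)])         # rows after the peak (NaN if there are none)
  unname(r - f)
}

# split() yields one data frame per he value; sapply() collects one HI per group
hi <- sapply(split(x, x$he), hi_index)
result <- data.frame(he = as.numeric(names(hi)), HI = unname(hi))

# The same idea with tapply: group the row indices, then subset the data frame
hi_tapply <- tapply(seq_len(nrow(x)), x$he, function(idx) hi_index(x[idx, ]))
```

    Here result has one row per he value, and its HI column should agree with the HI values in the tibble above. Grouping row indices is the usual way to let tapply reach several columns at once, since tapply itself only splits a single vector.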