I have a large dataset x with two parameters, q and ssc. They are grouped by the he value. Every he is a loop. There are many groups (≈100).
x <- data.frame(q = c(1.62, 1.82,2.09, 2.48, 2.19, 1.87, 1.67,1.44,1.8,2.52,2.27,1.83,1.68,1.54),
ssc = c(238, 388, 721, 744, 307, 246, 222,216,228,1169,5150,2217,641,304),
he = c(1,1,1,1,1,1,1,2,2,2,2,2,2,2))
plot(ssc~q, type = "o", group = he, data = x)
I want to apply my own functions, such as foo1, to every group:
foo1 <- function(i) {
M <- lm(log(ssc) ~ I(log(q)), data = x)
a <- exp(coef(M)[1])
b <- coef(M)[2]
res <- x$ssc - a*x$q^b
r <- mean(res[1:which.max(x$q)])
f <- mean(res[c((which.max(x$q)+1):length(x$q))])
HI <- r-f
return(HI)
}
In the end I want a matrix with two columns: he and the foo1 result. I tried tapply, but cannot figure out how to make it use two input columns (q and ssc):
tapply(X = list(x$q, x$ssc), x$he, foo1)
>Error in tapply(X = list(x$q, x$ssc), x$he, foo1) :
>arguments must have the same length
I made two changes to your function. First, you pass i but use x inside the function, so I changed x to i. Second, instead of returning a numeric, I add your result as a new column (HI) of the grouped data frame and return that:
foo1 <- function(i) {
M <- lm(log(ssc) ~ I(log(q)), data = i)
a <- exp(coef(M)[1])
b <- coef(M)[2]
res <- i$ssc - a*i$q^b
r <- mean(res[1:which.max(i$q)])
f <- mean(res[c((which.max(i$q)+1):length(i$q))])
i$HI <- r-f
return(i)
}
Use group_by(...) %>% do(foo1(.)) from dplyr to apply the function by group:
library(dplyr)  # provides %>%, group_by(), do() and ungroup()
x %>%
group_by(he) %>%
do(foo1(.)) %>%
ungroup()
# A tibble: 14 x 4
# Groups: he [2]
# q ssc he HI
# <dbl> <dbl> <dbl> <dbl>
# 1 1.62 238. 1. 207.
# 2 1.82 388. 1. 207.
# 3 2.09 721. 1. 207.
# 4 2.48 744. 1. 207.
# 5 2.19 307. 1. 207.
# 6 1.87 246. 1. 207.
# 7 1.67 222. 1. 207.
# 8 1.44 216. 2. -1961.
# 9 1.80 228. 2. -1961.
# 10 2.52 1169. 2. -1961.
# 11 2.27 5150. 2. -1961.
# 12 1.83 2217. 2. -1961.
# 13 1.68 641. 2. -1961.
# 14 1.54 304. 2. -1961.
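If you would rather get one row per group (he and the index), which is closer to the two-column result asked for in the question, a base R sketch along these lines might work. The name hi_index is a hypothetical helper, not something from the question; it repeats the foo1 computation but returns a single number. It also assumes each group has at least one point after the maximum of q; if the peak is the last point, this version returns NaN for that group rather than indexing past the end of the residuals.

```r
x <- data.frame(
  q   = c(1.62, 1.82, 2.09, 2.48, 2.19, 1.87, 1.67,
          1.44, 1.8, 2.52, 2.27, 1.83, 1.68, 1.54),
  ssc = c(238, 388, 721, 744, 307, 246, 222,
          216, 228, 1169, 5150, 2217, 641, 304),
  he  = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2)
)

# Hypothetical helper: same steps as foo1, but returns one number per group
hi_index <- function(d) {
  M <- lm(log(ssc) ~ log(q), data = d)   # power-law fit on the log scale
  a <- exp(coef(M)[[1]])
  b <- coef(M)[[2]]
  res <- d$ssc - a * d$q^b               # residuals on the original scale
  peak <- which.max(d$q)                 # position of the maximum q
  r <- mean(res[seq_len(peak)])          # mean residual up to and including the peak
  f <- mean(res[-seq_len(peak)])         # mean residual after the peak
  r - f
}

# split() cuts the data frame into one piece per he value;
# sapply() then calls hi_index on each piece
hi <- sapply(split(x, x$he), hi_index)
result <- data.frame(he = as.numeric(names(hi)), HI = unname(hi))
result
```

The reason tapply fails in the question is that it expects a single vector as X, while split() hands the whole per-group data frame to the function, so both q and ssc are available inside it. For the sample data the two HI values should agree with the HI column in the output above.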