I know this may be very simple but I can't solve it.
I have the following df:
Input Data
df<-data.frame(id=c(1,2,3,3,3,4, 4, 4, 4, 4, 4), value = c(956, 986, 995, 995, 986, 700, 600, 995, 956, 1000, 986))
Desired Result
df<-data.frame(id=c("1","2","3","3","3","4", "4", "4", "4", "4", "4"), value = c("956", "986", "995", "995", "986", "700", "600", "995", "956", "1000", "986"), median = c("956", "986","995","995", "995", "700","650","700","828", "956", "971"))
That is, I want a running median of value within each id. Each row adds one more value to its group, and the median should then be recalculated over all values seen so far for that id.
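To spell out the expected numbers, here is the calculation done by hand for id 4 only. This is just an illustration of the desired result, not a general solution; the vector `v` is a name introduced here for the example.

```r
# Values for id 4, in row order (copied from the input data above)
v <- c(700, 600, 995, 956, 1000, 986)

# Each median uses only the values seen up to that row
median(v[1:1])  # 700
median(v[1:2])  # 650
median(v[1:3])  # 700
median(v[1:4])  # 828
median(v[1:5])  # 956
median(v[1:6])  # 971
```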
My Attempt and Output Data
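library(dplyr)
# Attempt: this gives one median per id (the whole-group median), not a running one
w <- df %>%
  group_by(id) %>%
  mutate(median = median(value, na.rm = TRUE)) %>%
  select(median)  # id is kept as well because it is the grouping column
df$median <- w$median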
df<-data.frame(id=c("1","2","3","3","3","4", "4", "4", "4", "4", "4"), value = c("956", "986", "995", "995", "986", "700", "600", "995", "956", "1000", "986"), median = c("956", "986","995","995", "995", "971","971","971","971", "971", "971"))
You can use zoo::rollapplyr with a window width that grows by one on each row (seq_along(value)) to calculate a cumulative median within each id:
library(tidyverse)
library(zoo)
df %>%
  group_by(id) %>%
  mutate(median = rollapplyr(value, seq_along(value), median))
## A tibble: 11 x 3
## Groups:   id [4]
#       id value median
#    <dbl> <dbl>  <dbl>
#  1    1.  956.   956.
#  2    2.  986.   986.
#  3    3.  995.   995.
#  4    3.  995.   995.
#  5    3.  986.   995.
#  6    4.  700.   700.
#  7    4.  600.   650.
#  8    4.  995.   700.
#  9    4.  956.   828.
# 10    4. 1000.   956.
# 11    4.  986.   971.
Sample data
df <- data.frame(
  id = c(1, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4),
  value = c(956, 986, 995, 995, 986, 700, 600, 995, 956, 1000, 986)
)
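For comparison, the same cumulative median can be sketched in base R without zoo. This is a minimal alternative, not part of the approach above; `cum_median` is a hypothetical helper name introduced only for this example, and it assumes the rows of each id are already in the order in which values should accumulate.

```r
df <- data.frame(
  id = c(1, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4),
  value = c(956, 986, 995, 995, 986, 700, 600, 995, 956, 1000, 986)
)

# Hypothetical helper: median of the first i elements, for every i
cum_median <- function(x) {
  vapply(seq_along(x), function(i) median(x[seq_len(i)]), numeric(1))
}

# ave() applies the helper within each id and returns values in row order
df$median <- ave(df$value, df$id, FUN = cum_median)
df$median
# 956 986 995 995 995 700 650 700 828 956 971
```

The helper recomputes the median from scratch for every row, so it is quadratic in group size. That is fine for small groups like these but may be slow on long ones.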