I have a time series dataset "base" that looks like this:
date | var1 | var2 | var3 | var4 | ... |
---|---|---|---|---|---|
... | ... | ... | ... | ... | ... |
var1, var2, var3 and var4 are numeric. In realty the table has dozens of columns and rows.
My other dataset, "transform", looks like this:
variable | value |
---|---|
var1 | 2 |
var2 | 5 |
var3 | 1 |
var4 | 5 |
The "value" numbers go from 1 to 6.
In "base", I want to perform the transformation to take the difference of the logs only for variables that are associated to "5" in dataset "transform". How can I do this?
I tried:
transformation5 <- transform %>%
filter(value == 5) %>%
pull(variable)
base <- base %>%
mutate_at(vars(transformation5), ~ diff(log(.)))
But for some reason it considers all variables in "base". Thanks
1) dplyr There are several problems:
BOD
and transform
as shown.mutate_at
is not wrong but now across
is preferredfilter
acts on rows whereas we want to find which columns of transform
are 5, not rowsFirst find the desired columns , ix
, and then use across
.
library (dplyr)
transform <- data.frame(Time = 3, demand = 5)
ix <- which(transform == 5) # 2
BOD %>%
mutate(across(any_of(ix), ~ c(NA, diff(log(.x)))))
giving
Time demand
1 1 NA
2 2 0.21588838
3 3 0.61229508
4 4 -0.17185026
5 5 -0.02531781
6 7 0.23841102
2) Base R A base R version is the following. modifyList
replaces columns of the same name.
BOD |>
list(. = _) |>
with(modifyList(., lapply(.[ix], \(x) c(NA, diff(log(x))))))
3) collapse The collapse package can express this in a particularly compact way.
library(collapse)
nms <- names(transform)[ix]
tfmv(BOD, nms, Dlog)