Search code examples
rconditional-statements

How to perform function only for variables that attend to a condition in other dataset?


I have a time series dataset "base" that looks like this:

date var1 var2 var3 var4 ...
... ... ... ... ... ...

var1, var2, var3 and var4 are numeric. In realty the table has dozens of columns and rows.

My other dataset, "transform", looks like this:

variable value
var1 2
var2 5
var3 1
var4 5

The "value" numbers go from 1 to 6.

In "base", I want to perform the transformation to take the difference of the logs only for variables that are associated to "5" in dataset "transform". How can I do this?

I tried:

transformation5 <- transform %>%
  filter(value == 5) %>%
  pull(variable)

base <- base %>%
  mutate_at(vars(transformation5),  ~ diff(log(.)))

But for some reason it considers all variables in "base". Thanks


Solution

  • 1) dplyr There are several problems:

    • questions to SO should include a reproducible example. We have done that for you this time using the built-in 6x2 data frame BOD and transform as shown.
    • mutate_at is not wrong but now across is preferred
    • filter acts on rows whereas we want to find which columns of transform are 5, not rows
    • the diff/log function should return a vector the same length as the number of rows so we have added an NA

    First find the desired columns , ix, and then use across.

    library (dplyr)
    
    transform <- data.frame(Time = 3, demand = 5)
    ix <- which(transform == 5) # 2
    
    BOD %>%
     mutate(across(any_of(ix), ~ c(NA, diff(log(.x)))))
    

    giving

      Time      demand
    1    1          NA
    2    2  0.21588838
    3    3  0.61229508
    4    4 -0.17185026
    5    5 -0.02531781
    6    7  0.23841102
    

    2) Base R A base R version is the following. modifyList replaces columns of the same name.

    BOD |>
     list(. = _) |>
     with(modifyList(., lapply(.[ix], \(x) c(NA, diff(log(x))))))
    

    3) collapse The collapse package can express this in a particularly compact way.

     library(collapse)
     nms <- names(transform)[ix]
     tfmv(BOD, nms, Dlog)