I am interpolating a time series and need to use the approx function on a dataframe with 4 columns and 172660 rows in 4 groups (so 43165 rows per group). There are currently two answers about this: one using summarise, but interpolating just one column, and one using data.table. The first approach does work, but not for my purpose. I also noted that mutate_at, for example, has been superseded by mutate(across()), so I was trying a more up-to-date approach, but it's not working.
library(tidyverse)

tabela_1 <- tibble(x1 = rnorm(4800, mean = 88.5, sd = 4),
                   x2 = rnorm(4800, mean = -38.526, sd = 2.758),
                   x3 = rnorm(4800, mean = -22.6852, sd = 1.8652),
                   x4 = rnorm(4800, mean = -38.526, sd = 2.758),
                   tmpts = rep(x = seq(from = 0, to = 863.28, by = 0.72),
                               times = 4),
                   category = rep(x = 1:4, each = 1200))

tabela <- tibble(tmpts = rep(x = seq(from = 0, to = 863.28, by = 0.02),
                             times = 4),
                 category = rep(x = 1:4, each = 43165))

tabela_joined <- tabela %>%
  left_join(tabela_1, by = c("tmpts", "category")) %>%
  arrange(category, tmpts) %>%
  janitor::clean_names()

tabela_interpolation <- tabela_joined %>%
  group_by(category) %>%
  summarize(across(.cols = x1:x4, approx(., n = 43165)))
When running tabela_interpolation, I receive:
Error: Problem with `summarise()` input `..1`.
i `..1 = across(.cols = x1:x15, approx(., n = 43165))`.
x Can't convert an integer vector to function
i The error occurred in group 1: run = 1.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
  collapsing to unique 'x' values
How should I use summarise plus across to get the interpolated time series from the approx function for each column in the dataframe?
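The second argument of across must be a function (or a lambda), not a call such as approx(., n = 43165), which is evaluated to a value before across can apply it to each column. Pass approx itself and supply its extra arguments after it: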
library(tidyverse)

tabela_joined %>%
  group_by(category) %>%
  summarize(across(x1:x4, approx, n = 43165)) %>%
  ungroup()
Or, equivalently, with a lambda:

tabela_joined %>%
  group_by(category) %>%
  summarize(across(x1:x4, ~approx(., n = 43165))) %>%
  ungroup()
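
As background for the calls above, here is a minimal sketch of what approx returns when it is given a single vector: the values are used as y, their positions as x, pairs containing NA are dropped, and the result is a list with components x and y. The toy vector v is made up for illustration and is not the question's data.

```r
# Hypothetical toy vector: two observed values with a gap in between
v <- c(10, NA, NA, 40)

# With only one vector supplied, approx() uses the positions 1..4 as x
# and the values as y, drops the NA pairs, and interpolates linearly
# at n equally spaced points between the first and last remaining x.
res <- approx(v, n = 4)

str(res)   # a list with two numeric components, x and y
res$x      # the positions at which the interpolation was evaluated
res$y      # the interpolated values
```

Each column summarised with approx therefore carries both components for every group, not just the interpolated values.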
The summarised columns are list-columns, so the result can be passed to unnest to get the fully expanded dataframe:
tabela_joined %>%
  group_by(category) %>%
  summarize(across(x1:x4, approx, n = 43165)) %>%
  ungroup() %>%
  unnest(x1:x4)

#    category    x1    x2    x3    x4
#       <int> <dbl> <dbl> <dbl> <dbl>
#  1        1     1     1     1     1
#  2        1     2     2     2     2
#  3        1     3     3     3     3
#  4        1     4     4     4     4
#  5        1     5     5     5     5
#  6        1     6     6     6     6
#  7        1     7     7     7     7
#  8        1     8     8     8     8
#  9        1     9     9     9     9
# 10        1    10    10    10    10
# … with 345,310 more rows
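
Note that the unnested output above has 345,320 rows, twice 4 × 43165, because it stacks both components returned by approx: the x positions (the 1, 2, 3, ... shown in the first rows) and then the interpolated y values. If only the interpolated values are wanted, aligned with the original time stamps, one possible variant (a sketch, not part of the original answer) is to interpolate each column against tmpts inside mutate and keep just the y component. The toy tibble and the names toy and toy_interp below are hypothetical stand-ins for tabela_joined.

```r
library(dplyr)

# Hypothetical toy data: two groups, one observation every third time step
toy <- tibble(
  category = rep(1:2, each = 7),
  tmpts    = rep(0:6, times = 2),
  x1       = c(0, NA, NA, 3, NA, NA, 6, 10, NA, NA, 40, NA, NA, 70),
  x2       = c(6, NA, NA, 3, NA, NA, 0,  1, NA, NA,  1, NA, NA,  1)
)

toy_interp <- toy %>%
  group_by(category) %>%
  # For each column, interpolate linearly over the group's time stamps and
  # evaluate at every time stamp (xout); keep only the interpolated values ($y)
  mutate(across(x1:x2, ~ approx(tmpts, .x, xout = tmpts)$y)) %>%
  ungroup()

toy_interp
```

This keeps one row per original time stamp and group, so the row count stays the same as the input. With the question's data the equivalent call would use x1:x4 on tabela_joined, assuming the join on tmpts matched the coarse time stamps as intended.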