In dplyr/across, below code show error "...defined for equally-sized data frames". This caused by no profit_2023
to match with existing variable profit_2022
.How to handle it ?(I want the no match the result and return NA or blank)
library(tidyverse)
raw_data_2 <- data.frame(category=c('A','B','C'),
sales_2022=c(1,2,3),
profit_2022=c(0.1,0.4,0.3),
margin_2022=c(0.2,0.8,0.6),
sales_2023=c(1.5,3,4),
margin_2023=c(0.3,0.7,0.5)
)
raw_data_2 %>% mutate((across(contains("_2023"),.names="{col}_increase") / across(contains("_2022"))-1))
The main issue with your data is that it's quite untidy. Clean the data, and it becomes a lot more straightforward:
raw_data_2 |>
pivot_longer(cols = -category, names_to = c(".value", "year"), names_sep = "_") |>
mutate(across(c(sales, profit, margin), \(x) x/lag(x)-1, .names = "{.col}_increase"), .by = category)
Output:
# A tibble: 6 × 8
category year sales profit margin sales_increase profit_increase
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A 2022 1 0.1 0.2 NA NA
2 A 2023 1.5 NA 0.3 0.5 NA
3 B 2022 2 0.4 0.8 NA NA
4 B 2023 3 NA 0.7 0.5 NA
5 C 2022 3 0.3 0.6 NA NA
6 C 2023 4 NA 0.5 0.333 NA
# ℹ 1 more variable: margin_increase <dbl>
Notes: