Imagine I have several data frames from the same cross-national survey project. They ask the same questions over time but they entered the variables with different column names in each data frame. A variable (X) has column name v2
in one wave, V34
in another, etc.
df_w1 <- data.frame(v2 = c(3,4,6))
df_w2 <- data.frame(V34 = c(2,7,9))
I would like to select these variables, and append them in a new data frame. I can individually rename columns in each data frame, select the variables, and then append. But I would like to ask if there is a way to write a function, say, with a list of column names for a specific variable, which could go over the data frames, and then select, rename, and append values for that variable from different data frames.
For instance, the function would iterate through the list(df_w1, df_w2)
, check the variable names, identify v2
and V34
, rename them (X), and append them in a new data frame. And I need to run this function for different variables that have different column names across data frames.
Edit: Here is how the final data could look like:
wave X
w1 3
w1 4
w1 6
w2 2
w2 7
w2 9
If you keep a list same_cols
of the columns that are equivalent across data frames, you can iterate over that list, and over the data frames, using map
and rename_with
to get the names aligned.
library(tidyverse)
n <- 5
df_w1 <- data.frame(v1 = rnorm(n),
v2 = rnorm(n),
v3 = rnorm(n))
df_w2 <- data.frame(V23 = rnorm(n),
V34 = rnorm(n),
V99 = rnorm(n))
same_cols <- list(x1 = c("v2", "V34"),
x2 = c("v3", "V99"))
same_cols |>
imap(\(col_set, new_colname) {
list(df_w1, df_w2) |>
set_names(c('w1', 'w2')) |>
imap(\(df, wave) df |>
select(any_of(col_set)) |>
rename_with(\(col) new_colname) |>
mutate(wave_idx = wave)) |>
list_rbind()
})
Output
$x1
x1 wave_idx
1 0.4387218 w1
2 -0.6791123 w1
3 -1.7029648 w1
4 1.2651684 w1
5 0.3603572 w1
6 -0.7365782 w2
7 0.6224098 w2
8 -0.2907160 w2
9 -0.2142115 w2
10 -0.1125596 w2
$x2
x2 wave_idx
1 -0.5836394 w1
2 -1.9940788 w1
3 1.9022098 w1
4 3.3903708 w1
5 0.2074804 w1
6 -1.8636670 w2
7 0.8376299 w2
8 -1.4434929 w2
9 -0.2085702 w2
10 -0.4385635 w2