I would like to append dfToAdd to df, where the first has missing columns. Important detail is that df has 2 types of columns. 1st set of columns are correlating with each other. e.g. group="A" means name="Group A" and color="Blue". There can't be a combination of A-Group A-Red. 2nd type of columns are correlating among themselves. animal="Dog" action="Bark" And I would like to add this second data frame with missing columns of the first type of columns. Those columns should be filled with combinations of the first type of columns like the following dfResult (order of rows don't matter):
df = data.frame(group = c("A", "A", "A", "B", "B", "B"),
name = c("Group A", "Group A", "Group A", "Group B", "Group B", "Group B"),
color = c("Blue", "Blue", "Blue", "Red", "Red", "Red"),
animal = c("Dog", "Cat", "Mouse", "Dog", "Cat", "Mouse"),
action = c("Bark", "Meow", "Squeak", "Bark", "Meow", "Squeak")
)
dfToAdd = data.frame(animal = c("Lion", "Bird"),
action = c("Roar", "Chirp"))
dfResult = data.frame(group = c("A", "A", "A", "B", "B", "B", "A", "A", "B", "B"),
name = c("Group A", "Group A", "Group A", "Group B", "Group B", "Group B", "Group A", "Group A", "Group B", "Group B"),
color = c("Blue", "Blue", "Blue", "Red", "Red", "Red", "Blue", "Blue", "Red", "Red"),
animal = c("Dog", "Cat", "Mouse", "Dog", "Cat", "Mouse", "Lion", "Bird", "Lion", "Bird"),
action = c("Bark", "Meow", "Squeak", "Bark", "Meow", "Squeak", "Roar", "Chirp", "Roar", "Chirp"))
> df
group name color animal action
1 A Group A Blue Dog Bark
2 A Group A Blue Cat Meow
3 A Group A Blue Mouse Squeak
4 B Group B Red Dog Bark
5 B Group B Red Cat Meow
6 B Group B Red Mouse Squeak
> dfToAdd
animal action
1 Lion Roar
2 Bird Chirp
> dfResult
group name color animal action
1 A Group A Blue Dog Bark
2 A Group A Blue Cat Meow
3 A Group A Blue Mouse Squeak
4 B Group B Red Dog Bark
5 B Group B Red Cat Meow
6 B Group B Red Mouse Squeak
7 A Group A Blue Lion Roar
8 A Group A Blue Bird Chirp
9 B Group B Red Lion Roar
10 B Group B Red Bird Chirp
But the 1st type of columns (group, name, color) is not completely known. I am working with multiple grouping variables of an arbitrary number. You can imagine that there may or may not be be a description column="Group A is a good group" or date="2020.04.13". We only know for sure the columns of the second type: animal and action.
We could do this in a single %>%
by slice
ing the first row from 'df', select
the columns that are not the ones in 'dfToAdd', bind that with the 'dfToAdd', then do the row bind with 'df' and use complete
library(dplyr)
library(tidyr)
library(rlang)
library(purrr)
df %>%
slice(1) %>%
select(-names(dfToAdd)) %>%
uncount(nrow(dfToAdd)) %>%
bind_cols(dfToAdd) %>%
bind_rows(df, .) %>%
complete(nesting(!!! syms(names(dfToAdd))),
nesting(!!! syms(setdiff(names(.), names(dfToAdd)))))
# A tibble: 10 x 5
# animal action group name color
# * <fct> <fct> <fct> <fct> <fct>
# 1 Cat Meow A Group A Blue
# 2 Cat Meow B Group B Red
# 3 Dog Bark A Group A Blue
# 4 Dog Bark B Group B Red
# 5 Mouse Squeak A Group A Blue
# 6 Mouse Squeak B Group B Red
# 7 Bird Chirp A Group A Blue
# 8 Bird Chirp B Group B Red
# 9 Lion Roar A Group A Blue
#10 Lion Roar B Group B Red