Search code examples
rdataframejoinmergebind

R: Merging two dataframes without multiplying rows


I have two dataframes which I would like to merge:

df1:

Day   name 
1     A   
1     B     
1     C     
2     A   
2     B      
2     C    
3     A      
3     B   

df2:

Day   spending
2     25
2     30
3     20
3     15
3     10

Desired outcome:

Day   name   spending
1     A      NA
1     B      NA
1     C      NA
2     A      25   
2     B      30
2     C      NA
3     A      20
3     B      15
3     NA     10 

I have tried merge & join_all, but all I got was:

Day   name   spending
1     A      NA
1     B      NA
1     C      NA
2     A      25 
2     A      30
2     B      25      
2     B      30
2     C      NA
3     A      20
3     A      15
3     A      10
3     B      20
3     B      15
3     B      10
3     NA     20
3     NA     15
3     NA     10

Note that I wish to join the two dataframes by day without having the length of rows multiplying at all, despite missing data in some columns (such as day 1 data in df2 & missing name of C in df1). Please advise


Solution

  • That's not possible without adding a time variable.

    library(dplyr) #
    
    full_join(
      df1 %>% group_by(Day) %>% mutate(time=row_number()),
      df2 %>% group_by(Day) %>% mutate(time=row_number()), 
    by=c("Day", "time"))
    

    # A tibble: 9 x 4
    # Groups:   Day [3]
        Day name   time spending
      <int> <fct> <int>    <int>
    1     1 A         1       NA
    2     1 B         2       NA
    3     1 C         3       NA
    4     2 A         1       25
    5     2 B         2       30
    6     2 C         3       NA
    7     3 A         1       20
    8     3 B         2       15
    9     3 NA        3       10