
Need to join two dataframes in R and combine data


I have two different data frames with bird counts from two count points, recorded on different days. I need to combine them, but the column names of the two data frames do not match. My data frames look like this:

df1

Especies Conteo
(Crypturellus tataupa) 1
(Piaya cayana) 2

df2

Species Count
(Crypturellus tataupa) 3
(Celeus flavescens) 1

I have tried the various options of merge(), but none of them gave me what I needed.

And I need two different outputs:

- a complete merge where I end up with a total count for every species

Species Count
(Crypturellus tataupa) 4
(Piaya cayana) 2
(Celeus flavescens) 1

- a merge of the species with a separate column for each count

Especies Count1 Count2
(Crypturellus tataupa) 1 3
(Piaya cayana) 2 0
(Celeus flavescens) 0 1

As you can probably tell, I don't have a lot of experience in R. Thanks in advance.


Solution

  • You can try the code below to obtain out1 and out2, the first and second requested outputs, respectively:

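    # initial merged output: full outer join on the species column;
    # a species missing from one data frame gets NA in that count column
    out <- merge(
      df1,
      df2,
      by.x = "Especies",
      by.y = "Species",
      all = TRUE
    )
    
    # first output: overwrite Count with the row-wise total (ignoring NA),
    # then drop the second column (Conteo)
    out1 <- transform(
      out,
      Count = rowSums(out[-1], na.rm = TRUE)
    )[-2]
    
    # second output: replace NA with 0 and rename the count columns
    # to Count1 and Count2
    out2 <- cbind(
      out[1],
      setNames(
        replace(out[-1], is.na(out[-1]), 0),
        paste0("Count", seq_along(out[-1]))
      )
    )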
    

    where

    > out1
                    Especies Count
    1    (Celeus flavescens)     1
    2 (Crypturellus tataupa)     4
    3         (Piaya cayana)     2
    
    > out2
                    Especies Count1 Count2
    1    (Celeus flavescens)      0      1
    2 (Crypturellus tataupa)      1      3
    3         (Piaya cayana)      2      0
    

    data

    > dput(df1)
    structure(list(Especies = c("(Crypturellus tataupa)", "(Piaya cayana)"
    ), Conteo = 1:2), class = "data.frame", row.names = c(NA, -2L
    ))
    
    > dput(df2)
    structure(list(Species = c("(Crypturellus tataupa)", "(Celeus flavescens)"
    ), Count = c(3L, 1L)), class = "data.frame", row.names = c(NA,
    -2L))
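
As a cross-check on the first output, here is a minimal sketch of a different base R route. It is not the merge-based approach from the answer above. It stacks both data frames under common column names with rbind() and sums the counts per species with aggregate(). The names `long` and `total` are made up for illustration, and the data is the same as in the dput() output.

```r
# same data as in the dput() output above
df1 <- data.frame(
  Especies = c("(Crypturellus tataupa)", "(Piaya cayana)"),
  Conteo = 1:2
)
df2 <- data.frame(
  Species = c("(Crypturellus tataupa)", "(Celeus flavescens)"),
  Count = c(3L, 1L)
)

# give both data frames the same column names so they can be stacked
# (`long` is a hypothetical helper name)
long <- rbind(
  setNames(df1, c("Species", "Count")),
  setNames(df2, c("Species", "Count"))
)

# sum the counts per species (`total` is a hypothetical name)
total <- aggregate(Count ~ Species, data = long, FUN = sum)
total
```

This only covers the total count per species. For the side-by-side Count1/Count2 layout, the merge with all = TRUE is still the more direct option, because it keeps track of which data frame each count came from.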