SUMIFS issue in R


I'm fairly new to R and have spent hours today trying to solve this, so I thought it was time to turn it over to an expert. Here's my problem:

I have a dataframe, acme_one, that looks like this (a sample of tens of thousands of rows):

[image: sample rows of acme_one]

I have another dataframe, acme_two, that holds the distinct combinations of the product_id and tag_id columns from acme_one. It looks like this:

[image: acme_two]

What I want to do is add one more column, totals, to acme_two so that the end result looks like this:

[image: acme_two with the added totals column]

To populate the new column, totals, here's the logic of the calculation I've been trying to do:

SUM values in quantity column from dataframe acme_one where:

acme_one$product_id == acme_two$product_id AND acme_one$tag_id == acme_two$tag_id AND acme_one$true_false == 'TRUE' AND acme_one$in_out == 'in'

Can you help me with how to do this in R? Thank you so much in advance!


Solution

  • Using the dplyr package, you can do it this way:

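    library(dplyr)
    
    # Keep only the rows that meet the conditions, then sum quantity per pair
    acme_one_totals <- acme_one %>%
      filter(true_false == TRUE, in_out == 'in') %>%
      group_by(product_id, tag_id) %>%
      summarise(totals = sum(quantity), .groups = 'drop')
    
    # Attach the totals to acme_two; pairs with no matching rows get NA
    acme_two %>%
      left_join(acme_one_totals, by = c('product_id', 'tag_id'))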
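
For anyone who wants to try this without the original data, here is a minimal sketch on made-up rows. The sample values are hypothetical (the real acme_one was only shown as an image), and the final `coalesce()` step is an optional addition that assumes you want 0 rather than NA for pairs with no qualifying rows, which is how SUMIFS behaves in a spreadsheet.

```r
library(dplyr)

# Hypothetical sample data; the real acme_one has many more rows.
acme_one <- data.frame(
  product_id = c(1, 1, 1, 2, 2, 3),
  tag_id     = c("a", "a", "b", "a", "a", "c"),
  true_false = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE),
  in_out     = c("in", "in", "out", "in", "in", "out"),
  quantity   = c(5, 3, 7, 2, 4, 9)
)

# Distinct (product_id, tag_id) pairs, as described in the question.
acme_two <- distinct(acme_one, product_id, tag_id)

# Sum quantity per pair over the rows that satisfy both conditions.
acme_one_totals <- acme_one %>%
  filter(true_false == TRUE, in_out == "in") %>%
  group_by(product_id, tag_id) %>%
  summarise(totals = sum(quantity), .groups = "drop")

# Join the totals back; replace NA (no qualifying rows) with 0.
result <- acme_two %>%
  left_join(acme_one_totals, by = c("product_id", "tag_id")) %>%
  mutate(totals = coalesce(totals, 0))

print(result)
# totals column: 8, 0, 4, 0
```

Pairs (1, "b") and (3, "c") only have 'out' rows, so they have no entry in acme_one_totals and would be NA after the join; the `coalesce()` call turns those into 0. Drop that line if you would rather keep NA.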