Search code examples
google-sheetsduplicatesgoogle-sheets-formula

How can I compound data for sequential duplicates while ignoring the first of those duplicates?


I'm using Google Sheets and I'm having trouble trying to build a formula to make it so that 7 days is added to a date for every duplicate, but also ignoring the first instance, so for example:

A            B                                           C           D
APPLES       3/10/23                                     APPLES      3/10/23
BANANA       6/9/23                                      APPLES      3/17/23
ORANGES      9/5/23                                      BANANA      6/9/23
TURNIPS                                                  ORANGES     9/5/23
                                                         TURNIPS             
                                                         APPLES      3/24/23
                                                         BANANA      6/16/23

*Updated for dates that are currently empty where the formula needs to check for blanks and D will be auto-filled in once the date in column B is entered

Where A and B are the original data and the cells in column D would have the formula im trying to build to grab from column B according to if C matches A and add 7 compounding days for every duplicate.

I've tried using combinations of if statements, filters, vlookups, and countifs which work for grabbing the starting values (C1/D1, C3/D3 and so on), but the problem I keep running into is ignoring the first instance while also compounding every duplicate.


Solution

  • UPDATED ANSWER BASED ON ADDITIONAL REQUIREMENTS

    Try the following in D1 (ensuring that you have formatted the cells in column D as a date):

    =arrayformula(let(
    inputcats,tocol(A:A,1),
    inputdates,B1:index(B:B,counta(inputcats)),
    outputcats,tocol(C:C,1),
    dateseq,xlookup(outputcats,inputcats,inputdates)+7*(countifs(outputcats,outputcats,sequence(counta(outputcats)),"<="&sequence(counta(outputcats)))-1),
    if(dateseq>0,to_date(dateseq),)))
    

    As per the previous version of my answer, the COUNTIFS gets the running count of each value in column C. Subtract 1 from each value so the running counts start at zero. Multiply by 7 so that the running count for each value in C effectively becomes 0,7,14... All you need to do then is use XLOOKUP to get the 'start' date for each value from columns A&B and add this to the running count we have created using COUNTIFS. To deal with missing dates I've added additional logic to only output a date if the datevalue generated is >0 (otherwise 12/30/1899 is generated from result cells with datevalue 0).

    I've also used LET variables this time to make the formula generic for any numbers of input categories and dates or output categories.