Search code examples
google-sheetsgoogle-sheets-formula

How to highlight repeated data in alternating columns?


Exemple: https://docs.google.com/spreadsheets/d/1o6088xXfnEYn-RflrPvY3e_OGJyrI3-tMIb3tFbm6e8/edit?usp=sharing

I have a question about google sheets, columns id 1 and id 2 have codes that together result in a unique code like column unique id. But, occurs a repetition between columns id 1 and id 2 and I would like to know how to highlight that repetition where the unique id is return. Like in row 3 and 8.


I try something like this:

=QUERY(B3:D10;"select B where "JOIN("",C,D)"="JOIN("",D,C)" ")

=FILTER(C3:D11;CONCAT(C3:D3)=CONCAT(C3:D3))

The expected result is something like line 15 of the example.


Solution

  • To find rows where both IDs are duplicates of IDs found in another row, use filter(), like this:

    =let( 
      data; B3:D11; id_1; C3:C11; id_2; D3:D11; 
      compoundIds; 
        map( 
          id_1; id_2; 
          lambda( 
            c; d; 
            if(c < d;  c & "→" & d; d & "→" & c) 
          ) 
        ); 
      filter( 
        data; 
        countif(compoundIds; compoundIds) > 1 
      ) 
    )
    

    If you do not have the recently introduced let() function yet, use lambda() instead:

    =lambda( 
      data; id_1; id_2; 
      lambda( 
        compoundIds; 
        filter( 
          data; 
          countif(compoundIds; compoundIds) > 1 
        ) 
      )( 
        map( 
          id_1; id_2; 
          lambda( 
            c; d; 
            if(c < d;  c & "→" & d; d & "→" & c) 
          ) 
        ) 
      ) 
    )( 
      B3:D11; C3:C11; D3:D11 
    )
    

    To highlight duplicates in the list, use this conditional formatting custom formula rule for the range C3:D11:

    =len($F3:$F11)
    

    ...and fill column F with this formula in cell F3:

    =let( 
      ids; B3:B11; id_1; C3:C11; id_2; D3:D11; 
      compoundIds; 
        map( 
          id_1; id_2; 
          lambda( 
            c; d; 
            if(c < d;  c & "→" & d; d & "→" & c) 
          ) 
        ); 
      map( 
        ids; compoundIds; 
        lambda( 
          id; compoundId; 
          if( 
            countif(compoundIds; compoundId) > 1; 
            id & " = " & 
            join( ", "; filter(ids; row(ids) <> row(id); compoundIds = compoundId) ); 
            iferror(1/0) 
          ) 
        ) 
      ) 
    )
    

    ...or use the equivalent lambda:

    =lambda( 
      ids; id_1; id_2; 
      lambda( 
        compoundIds; 
        map( 
          ids; compoundIds; 
          lambda( 
            id; compoundId; 
            if( 
              countif(compoundIds; compoundId) > 1; 
              id & " = " & 
              join( ", "; filter(ids; row(ids) <> row(id); compoundIds = compoundId) ); 
              iferror(1/0) 
            ) 
          ) 
        ) 
      )( 
        map( 
          id_1; id_2; 
          lambda( 
            c; d; 
            if(c < d;  c & "→" & d; d & "→" & c) 
          ) 
        ) 
      ) 
    )( 
      B3:B11; C3:C11; D3:D11 
    )
    

    To find rows where id_1 is a duplicate of an id_2 found in the other column in another row, use:

    =filter( 
      B3:C11; 
      countif(D3:D11; C3:C11) 
    )
    

    To get the IDs of the duplicates, use vlookup(), like this:

    =arrayformula( iferror( vlookup(C18:C; { D3:D11 \ B3:B11 }; 2; false) ) )
    

    To highlight duplicates in the list, use this conditional formatting custom formula rule for the range C3:D11:

    =countif($C$3:$D$11; C3) > 1
    

    See your sample spreadsheet.

    As shown above, in the event your spreadsheet locale uses comma , as decimal separator, you will have to use semicolon ; as formula argument separator and backslash '\' as { array expression } horizontal separator. If your spreadsheet uses period . as decimal mark, use comma , for both those purposes.