Search code examples
excelexcel-2010conditional-formattingcountif

Identify Duplicates or COUNTIF > 255 characters (long text)


I have a strange problem, I have an excel document with two columns, col A contains the key, this is not unique, col b contains text of varying length, usually above 255 characters.

I'd like to apply a filter to column A to select two keys, this is fine, I'd then like to identity any duplicates that exists within column B - applying only to fields that are visible, i.e. as a result of the filter on column A.

This should be simple; however, the conditional formatting is failing to identify duplicates, I'm guessing because it does not like the length of the text.

Even trying this the mandrolic way fails as the search box appears to only be able to accept a search string of a certain length.


Solution

  • The COUNTIF function has a string criteria limit of 255 characters.

    From support.office.com
    Wrong value returned for long strings.      The COUNTIF function returns incorrect results when you use it to match strings longer than 255 characters.

    There is a work-around solution offered by support.office.com on the COUNTIF support page but I couldn't get it working so I wrote a User defined Function that does work and added hidden/visible and case sensitive options.

    COUNTIFSBIGTXT function - CountIfs functionality for criteria strings longer than 255 characters

    Put this in a public module code sheet (alt+F11, Insert, Module).

    Option Explicit
    
    Function COUNTIFSBIGTXT(iOptions As Long, ParamArray pairs()) As Long
        'COUNTIFSBIGTXT - CountIfs functionality for criteria strings longer than 255 characters
        ' https://stackoverflow.com/questions/51688846#51689459
        '
        ' =COUNTIFSBIGTXT(<options>, <string_range1>, <criteria1>, [string_range2], [criteria2], …)
        '        OPTIONS
        '      0 No options
        '     +1 Include hidden cells in <string_range1>, [string_range2], etc
        '     +2 Case sensitive comparison
    
        'throw error if string_range and criteria do not come in pairs
        If Not CBool(UBound(pairs) Mod 2) Then
            COUNTIFSBIGTXT = CVErr(xlErrValue)
            Exit Function
        End If
    
        'declare variables
        Dim i As Long, j As Long
        Dim bIncludeHidden As Boolean, bCaseSensitive As Boolean
    
        'set optional booleans
        bIncludeHidden = CBool(1 And iOptions)
        bCaseSensitive = CBool(2 And iOptions)
    
        'restrict full column references to the parent worksheet's UsedRange
        Set pairs(LBound(pairs)) = Intersect(pairs(LBound(pairs)), pairs(LBound(pairs)).Parent.UsedRange)
    
        'resize all <string_range> to the same dimensions
        With pairs(LBound(pairs))
            For i = LBound(pairs) + 2 To UBound(pairs) Step 2
                Set pairs(i) = pairs(i).Resize(.Rows.Count, .Columns.Count)
                'Debug.Print pairs(i).Address(0, 0)
            Next i
        End With
    
        'loop cell count in pairs(LBound(pairs)) for relative ordinal
        For i = 1 To pairs(LBound(pairs)).Cells.Count
            'loop through each pair of <string_range> and <criteria>
            For j = LBound(pairs) To UBound(pairs) Step 2
                'exit for if any argument pair does not meet criteria
                With pairs(j).Cells(i)
                    'throw out worksheet error codes
                    If IsError(.Value) Then Exit For
                    'do the pair(s) meet a case insensitive match
                    If LCase(.Value2) <> LCase(pairs(j + 1)) Then Exit For
                    'do the pair(s) meet a case sensitive match with option
                    If .Value2 <> pairs(j + 1) And LCase(.Value2) = LCase(pairs(j + 1)) And bCaseSensitive Then Exit For
                    'are the cells visible or hidden with include option
                    If (.EntireRow.Hidden Or .EntireColumn.Hidden) And Not bIncludeHidden Then Exit For
                End With
            Next j
    
            'determine if all argument pairs matched
            If j > UBound(pairs) Then _
                COUNTIFSBIGTXT = COUNTIFSBIGTXT + 1
        Next i
    
    End Function
    

    Syntax:

    =COUNTIFSBIGTXT(<options>, <string_range1>, <criteria1>, [optional string_range2], [optional criteria2], …)
    

    Documentation

    enter image description here

    Sample data²

    The following examples were based on 9 rows of an identical 600 character string with three rows forced to upper-case.

    enter image description here

    Example 1

    Simple COUNTIF operation discarding filtered/hidden rows and not case sensitive.

    =COUNTIFSBIGTXT(0, B:B, B2)
    

    enter image description here

    Example 2

    Expanded COUNTIFS¹ operation discarding filtered/hidden rows but case sensitive string comparison. Secondary criteria with A:A equals the value in A2.

    =COUNTIFSBIGTXT(2, B:B, B2, A:A, A2)
    

    enter image description here


    ¹ The logic flow 'short-circuits' on fail. With multiple string/criteria pairs, you can enhance performance
        by listing the least likely string_range/criteria matches first. You might see similar increases in calculation
        efficiency for special circumstances by reordering the latter three 'kick-out' Exit For statements for your
        particular requirements but I recommend keeping the check for worksheet errors as the primary check.
        For example, if you have a lot of potential string matches but very few visible rows, moving the check
        for visible cells above the check for string matches would cut down on the criteria checks.

    ² Many thanks to the Lorem Ipsum Generator for the sample string content.