Search code examples
excelcolorsexcel-formulawildcardformula

How do I use Excel's "=SUMPRODUCT(--ISNUMBER(SEARCH..." function to exclude 'complicated compounded' results?


I am trying to validate a list of product descriptions to highlight any instances where the colour of the product has been used in its description.

I have a list of colours...

Example:
Blue
Red
Green
Yellow
Purple
etc.

I am using the following formula to detect if one of these colours has been used:

=IF(SUMPRODUCT(--ISNUMBER(SEARCH(Table1[Colors],A1)))>0,"Cannot include a colour","")

Where Table1[Colours] contains my list of colour text strings and A1 contains my first product description.

(Dave Bruns @ ExcelJet has a great read for anyone wanting to use SUMPRODUCT/ISNUMBER/SEARCH combinations.)

If a product description contains a colour specified in my Table1[Colours] list the formula produces "Cannot include a colour" to remind the user this is not allowed.

Example: "Garmin Forerunner 10 Running Sportswatch Green"

The first issue i am faced with is that my current formula procs when the product description contains a compound word containing a colour...

Example: "Blackberry Z10 Smartphone"

This inaccurately invalidates the description because the string "Black" in this text is not being used to describe the colour of the product.

As the title suggests, my main issue lies with outsmarting 'complicated compound' words...

... my Table1[Colours] list does not simply contain the basic Primary, Secondary and Tertiary colours, but also more exotic ones like Coral, Fuchsia and Tan.

This causes complication when the product description contains a word like "Stand".

Why is this a problem you may be thinking? Stand contains one of my exotic colours 'Tan'

S-Tan-d

Unfortunately this also causes my formula to proc. (Annoying right?)

The solution I am looking for is an addition to my existing formula =IF(SUMPRODUCT(--ISNUMBER(SEARCH(Table1[Colors],A1)))>0,"Cannot include a colour","") which accounts for the possible occurrence of a 'complicated compound' be this by a counter list of acceptable words (e.g. Table2[Exceptions] or by wild carding the search to match the exact colour with no Prefix or Suffix (this option would have to allow for the possibility of a dual colour separated by a / e.g. "Black/Red", so wild carding with certain punctuation exceptions?)

...Its all just a bit horrible and inconvenient.

Any advice is appreciated.

Thanks, Mr. J


Solution

  • You need to search for the word boundaries. If you add a space to the beginning and end of the color, and also to the beginning and end of your description, that should do the trick depending on your data. So the search part of your formula could read:

    SEARCH(" " &Table1[Colors]& " "," "&A1&" ")
    

    Or, for your entire formula:

    =IF(SUMPRODUCT(--ISNUMBER(SEARCH(" "&Table1[Colors]&" "," "&A1&" ")))>0,"Cannot include a colour","")
    

    If you have hyphenated colors, e.g: blue-green, or a color like Cherry3, you would need to list them separately in your table.

    EDIT: As your comment suggests a much more complex situation, I would suggest a User Defined Function (UDF) for ease of maintenance.

    The following UDF can accept, as findtext a range, a single string, or an array constant consisting of several strings. If you use an array constant, you must use the semicolon ; and not a comma as the separator.

    A usage example:

    =IF(reMatch(Table1[Colors],A1),"Cannot include a colour","")
    

    The code uses the Regex token for Word Boundary. A word boundary is the point at which a character in the set [0-9A-Za-z_] is adjacent to any character not in that set, or next to the beginning or end of the string. That should cover all of your IF function examples, and more.

    Option Explicit
    Function reMatch(FindText As Variant, WithinText As String) As Boolean
    'FindText could be a Range, an array constant, or a single item
    'If FindText is an array constant, separate elements by semicolons, not commas
        Dim RE As Object
        Dim I As Long
        Dim C As Range
        Dim vFind As Variant
    
    reMatch = False
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .IgnoreCase = True
    
        vFind = FindText  'will create array if FindText is a range
        If IsArray(vFind) Then
            For I = 1 To UBound(vFind)
                .Pattern = "\b" & vFind(I, 1) & "\b"
                If .Test(WithinText) = True Then
                    reMatch = True
                    Exit Function
                End If
            Next I
        Else
            .Pattern = "\b" & vFind & "\b"
            If .Test(WithinText) = True Then _
                reMatch = True
        End If
    End With
    
    End Function
    

    EDIT: As written, FindText can be a range of cells; however, that range must be a single column vertical range. If it is a horizontal range, the function will return the #VALUE! error. If necessary, the UDF could be modified to handle that by testing vFind and ensuring it is a 2D array. This would also enable the use of array constants with comma separators (the additional code is that which is seen between the first and last lines of code below.

    ...
    vFind = FindText  'will create array if FindText is a range
    
    'make sure vFind is 2D (if array)
    On Error Resume Next
        J = UBound(vFind, 2)
        If Err.Number <> 0 Then vFind = WorksheetFunction.Transpose(vFind)
    On Error GoTo 0
    
    If IsArray(vFind) Then
    ...