Search code examples
google-sheetsgoogle-sheets-formulaarray-formulas

Find first wildcard/expression that matches given value


What I need

I have two Google Sheets tables. I want to find the first cell of table 1 (which contain wildcards) that matches against my value in table 2. Then, take the value in the next column of the matching expression-cell and put it in the second column of table 2.

Example

As an example, given the table below:

Example Table

Here is the CSV data:

Job Wildcards,Salary,,Job Description,Salary
*Dev*,40000,,Junior Developer,40000
*Engineer*,50000,,Senior Developer,40000
*Manager*,60000,,Platform Engineer,50000
,,,Project Manager,60000
,,,Team Manager,60000

The salaries in column E shall be auto-populated by looking for the first 'Job Wildcard' in the first table that matches the 'Job Description' and saving it's salary. I've provided the correct values manually.


What I've tried

First I've thought about vlookup:

=VLOOKUP(D2,A$2:$B5,2,FALSE)

But the error message confirms this does not work: Did not find value 'Junior Developer' in VLOOKUP evaluation.. Because I don't want to find 'Junior Developer', I wanna know what matches the other way around.

Then I've tried using index and match like this:

=INDEX(B$2:B$6,MATCH(D2,A$2:A$7,0))

But this returns a #N/A with kinda the same error as vlookup: Did not find value 'Junior Developer' in MATCH evaluation. which, again, makes sense. I still need to match column A against D and not D against A. But how do I do that?

Then I tried to use REGEXMATCH. For that, I've changed the wildcards to regular expressions (making * to .*):

=INDEX(FILTER(REGEXMATCH(A2,ARRAYFORMULA(D2:D7)), NOT(FALSE)), 1)

But here, I get the error FILTER has mismatched range sizes. Expected row count: 7. column count: 1. Actual row count: 1, column count: 1.. And at this point, I admit defeat at this end.

I also replaced * with % and tried QUERY:

=QUERY($A$2:$D$6,"SELECT B where D like A")

But this also does not have the effect I was hoping for.

I think I'm close, but I would really appreciate a nudge into the right direction.


Solution

  • @ztiaa's and @rockinfreakshow's solutions were incorrect at like 5-20% of the time, depending on the regex-column sorting. As these formulas are quite complicated, I had no idea how to debug them.

    I've found two other solutions.

    Solution 1: App Scripts function

    Go to Extensions->Apps Script and add the following custom function:

    /**
     * Find the first cell of a range that matches against the given text and return a return value. 
     * @param {string} input The text to search for.
     * @param {range} input Two colums. The first must contain regular expressions to match against. The second must contain the corresponding return values.
     * @return The return value of the first regex match in the range that matches the given text. Returns 'NO_MATCH' on no match.
     * @customfunction
    */
    function matchTextToRegexArray(cellText, regexResultsArray) {
      for (var i = 0; i < regexResultsArray.length; i++) {
        var regex = new RegExp(regexResultsArray[i][0], "i");
        if (regex.test(cellText)) {
          return regexResultsArray[i][1];
        }
      }
      return "NO_MATCH";
    }
    

    Use it in the table as =matchTextToRegexArray(D2,A$2:B).

    This one is, for me, easy to understand. But it can be quite slow. Faster is solution 2

    Solution 2: QUERY

    I've found the correct query:

    =QUERY(A$3:$B$99, "select B where '"&lower(D2)&"' matches LOWER(A) limit 1")
    

    Where:

    • D2 is the search term
    • Column A contains regular expressions (you might need .* at the beginning and end of all terms in this column)
    • Column B contains the value you want to save alongside your search term

    This is fast, easy and does not require any extras.

    Thanks never the less!