Search code examples
excelexcel-formulavlookup

How to use an Excel formula to obtain the best match based on the wording within the same cell in Excel?


I need to find the exact matching case between two Excel data tables. While working on it, I have searched through most of the websites. Exact Match 001 During this process, I have tried various functions such as VLOOKUP, HLOOKUP, XLOOKUP, Index Match. However, in all cases, I couldn't compare two separate words within the same cell.

Next, I attempted to separate the words using the TEXTSPLIT function and tried to count them using the Search and COUNTIF fucntion. Unfortunately, I didn't find a proper solution for this.

I require expert support to write a formula that retrieves the best case number from Table 01, using the data from Table 02.

For more clarity, please refer to the link of the Google Sheet as well.

Best regard Indika

Table 01

Table 01

Table 02

Table 02

Expecting Solution

Solution


Solution

  • This is quite intricate, but here are my two cents to find the best match using Excel ms365:

    enter image description here

    Formula in B3:

    =MAP(A3:A7,LAMBDA(s,REDUCE("",D3:D6,LAMBDA(x,y,LET(fit,MMULT(--ISNUMBER(FIND(" "&TEXTSPLIT(TEXTAFTER(" "&A3:A7," ",SEQUENCE(,99))," ")&" "," "&y&" ")),SEQUENCE(99,,,0)),IF(ROW(s)-2=XMATCH(MAX(fit),fit),LET(z,VLOOKUP(y,D3:E6,2,0),IF(x="",z,IF(COUNTA(TEXTSPLIT(y," "))<COUNTA(TEXTSPLIT(XLOOKUP(x,E3:E6,D3:D6)," ")),z,x))),x))))))
    

    The idea here is that we iterate all values under 'Data 01' using MAP(). Each iteration, using the 's' variable, we have a nested REDUCE(). The idea behind this nested function is to find out if 's' is the best fit for one of the values under 'Data02'.

    To find out what is the best fit I applied two things:

    • I checked how many substrings (split by space) each value under 'Data01' actually has compared to each value under 'Data02'. This is done by the MMULT() function. If the current row indeed holds the most substrings available in any of the values under 'Data02' then take the number;
    • REDUCE() will loop all the values under 'Data02' to test the above for all strings. If it so happened that 's' is the best fit for multiple values under 'Data02' then this is now able to check if the next value is made up of less substrings. My assumption here: fewer substrings (split by space) == a better fit!.