Search code examples
arraysexcelexcel-formulalevenshtein-distanceexcel-lambda

How can I get this LAMBDA function to accept an array as input?


I'm using this Lambda function to calculate the Levenshtein distance between two strings. In Excel it's called LEV.

=LAMBDA(a,b,[ii],[jj],[arr],
        LET(
            i,IF(ISOMITTED(ii),1,ii),
            j,IF(ISOMITTED(jj),1,jj),
            a_i,MID(a,i,1),
            b_j,MID(b,j,1),
            init_array,MAKEARRAY(
                    LEN(a)+1,
                    LEN(b)+1,
                    LAMBDA(r,c,IFS(r=1,c-1,c=1,r-1,TRUE,0))
                    ),
            cost,N(NOT(a_i=b_j)),
            this_arr,IF(ISOMITTED(arr),init_array,arr),
            option_a,INDEX(this_arr,i+1-1,j+1)+1,
            option_b,INDEX(this_arr,i+1,j+1-1)+1,
            option_c,INDEX(this_arr,i+1-1,j+1-1)+cost,
            new_val,MIN(option_a,option_b,option_c),
            overlay,MAKEARRAY(
                    LEN(a)+1,
                    LEN(b)+1,
                    LAMBDA(r,c,IF(AND(r=i+1,c=j+1),new_val,0))
                    ),
            new_arr,this_arr+overlay,
            new_i,IF(i=LEN(a),IF(j=LEN(b),i+1,1),i+1),
            new_j,IF(i<>LEN(a),j,IF(j=LEN(b),j+1,j+1)),
            is_end,AND(new_i>LEN(a),new_j>LEN(b)),
            IF(is_end,new_val,LEV(a,b,new_i,new_j,new_arr))
            )
)

It works fine when comparing two strings. For example, if I pass in

=lev("book","back")

it returns 2.

However, if c2 contains

book look

And I put in

=lev(textsplit(c2," "),"back")

It returns #N/A. I want it to return

2
3

To be honest, I don't know where to start in troubleshooting this function. Can anyone help?

@Ike's answer very helpfully answered the question, but unfortunately my question was a bit ill-stated.

I need to get the lev distance between every word in c2 and d2.

Suppose c2 is

red ball

And d2 is

yellow window

The desired output would be the levenstein distance between each word in c2 and d2.

There are 4 possible comparisons: red/yellow, red/bat, ball/yellow, and ball/bat.

Desired output:

5
4
5
6

Note that I only want comparison between arrays, not within them. I don't want the distance between red and ball, because they're part of the same array.


Solution

  • You can use this formula:

    =LET(a,C2,b,D2,
    aSplit,TEXTSPLIT(a,," "),
    DROP(REDUCE("",aSplit,LAMBDA(r,x,VSTACK(r,lev(x,b)))),1))
    

    enter image description here

    As @vbasic2008 said, BYROW doesn't work.

    Passing an array to the Lev-Lambda would need another step of recursion within the Lambda function.

    Using REDUCE like this keeps the Lev-Lambda untouched.

    **** EDIT due to comment ****

    =LET(a,C2,b,E2,
    aSplit,TEXTSPLIT(a," "),
    bSplit,TEXTSPLIT(b," "),
    r, DROP(REDUCE("",bSplit,LAMBDA(r_2,x_2,VSTACK(r_2,
    DROP(REDUCE("",aSplit,LAMBDA(r,x,HSTACK(r,lev(x,x_2)))),,1)))),1),
    HSTACK(VSTACK("",TRANSPOSE(bSplit)),VSTACK(aSplit,r))
    )
    

    enter image description here