Search code examples
sqlgoogle-apps-scriptgoogle-sheetslevenshtein-distance

Levenshtein Distance/Custum Function in Google Sheets Query


I added a function levenshteinDistanceDP(string1,string2) as a custom function in AppScript. It returns an Integer. Additionally I have the Query in a sheet:

QUERY(Expenses!A1:H;"SELECT A, B, C, D WHERE LOWER(H) contains LOWER("""&K3&""") ";1)

Complementary to "WHERE LOWER(H) contains LOWER("""&K3&""")" i want to include results where levenshteinDistanceDP(lower(H),Lower(K3)) is smaller than lets say 3.

I guess there would be a way, by adding a column to "Expenses" which uses the custom function, and the query just checks for that column. If possible I would like to do it within the query tho. Any ideas?

Edit: The idea with the extra column is especially unsuitable, since the custom function takes around a second to load in each row. For several thousand rows unfeasible, since K3 can be updated at the will of the user (me) and would then be recalculated every time. I guess that fact could jeopardize the entire idea, but maybe it runs somewhat faster in the query. One reason more to include it in the query.


Solution

  • Inside the QUERY you won't have the option to apply the Levenshtein Distance formula. What I suggest to make it more feasible is not to use your function as a custom function inside a cell but to make it calculate once with onEdit.

    function onEdit(e){
      var sh = e.source.getActiveSheet()
      var shname = e.range.getSheet().getName()
      var column = e.range.getColumn()
      var row = e.range.getRow()
      var range = e.range
    
      if(shname == "Expenses"){ 
       if(range.getA1Notation() == "K3"){
         var rangewords = sh.getRange("H:H")
         var words = rangewords.getValues()
         for(i=0;i<words.length;i++){
          if(words[i]=="" || words[i]==null) {return}
          else {sh.getRange(i+1,10).setValue(levenshteinDistanceDP(words[i].toString().toLowerCase(),range.getValue().toString().toLowerCase()))} //change 10 with the number of column you desire to use as auxiliar for this
         }
        }
       else if(column == 8){
        sheet.getRange(row,10).setValue(levenshteinDistanceDP(range.getValue().toString().toLowerCase(),sh.getRange("K3").getValue()].toString().toLowerCase()))  //change 10 with the number of column you desire to use as auxiliar for this}
       }
     }
    
    }
    

    When you change K3 it will re-calculate all the column once, when you change a value in column H it will only calculate that row once. Try it and let me know. You may share a sample file with the functions to debug if necessary