Search code examples
google-bigquerybigquery-udf

Cannot use bigquery udf (bqutil) in processing location: us-west-2


We are trying to use these in us-west2 - https://github.com/GoogleCloudPlatform/bigquery-utils/tree/master/udfs/community.

this first query processes just fine, in US enter image description here

this second query wont run enter image description here

Our dataset models is in us West 2. It seems all queries from the 2nd query editor are then processed in us-west 2 where, it seems bqutil does not exist? How can we find the function bqutil.fn.levenshtein when processing in us-west2 (where our datasets all exist)?


Solution

  • To use the levenshtein UDF in your BigQuery table, you need to create a UDF in the location where your dataset resides.

    You can refer to the below UDF and the screenshot where the data resides in us-west2 location.

    UDF :

    CREATE OR REPLACE FUNCTION
    `stackdemo.fn_LevenshteinDistance`(in_a STRING, in_b STRING) RETURNS INT64 LANGUAGE js AS R"""
    
    var a = in_a.toLowerCase();
    var b = in_b.toLowerCase();
     if(a.length == 0) return b.length;
    if(b.length == 0) return a.length;
    var matrix = [];
    // increment along the first column of each row
    var i;
    for(i = 0; i <= b.length; i++){
     matrix[i] = [i];
    }
    // increment each column in the first row
    var j;
    for(j = 0; j <= a.length; j++){
     matrix[0][j] = j;
    }
    // Fill in the rest of the matrix
    for(i = 1; i <= b.length; i++){
     for(j = 1; j <= a.length; j++){
       if(b.charAt(i-1) == a.charAt(j-1)){
         matrix[i][j] = matrix[i-1][j-1];
       } else {
         matrix[i][j] =
           Math.min(matrix[i-1][j-1] + 1, // substitution
           Math.min(matrix[i][j-1] + 1, // insertion
           matrix[i-1][j] + 1)); // deletion
       }
     }
    }
    return matrix[b.length][a.length];
    """;
    
    

    Query :

    SELECT
     source,
     target,
     `stackdemo.fn_LevenshteinDistance`(source, target) distance,
    FROM UNNEST([
     STRUCT('analyze' AS source, 'analyse' AS target),
     STRUCT('opossum', 'possum'),
     STRUCT('potatoe', 'potatoe'),
     STRUCT('while', 'whilst'),
     STRUCT('aluminum', 'alumininium'),
     STRUCT('Connecticut', 'CT')
    ]);
    
    

    Output :

    enter image description here