machine-learning google-bigquery bigdata

Why isn't ML.NGRAM supported in the TRANSFORM clause in BigQuery ML?


I am using the following query to create a model, but the editor complains that ML.NGRAM is not supported within the TRANSFORM clause.

CREATE OR REPLACE MODEL
`singular-hub-291814.movie_sentiment.mymodel3`
TRANSFORM(ML.NGRAM(string_field_0,[1,2])OVER() as ngram )
OPTIONS
  ( model_type='LOGISTIC_REG',
    auto_class_weights=TRUE,
    data_split_method='RANDOM',
    DATA_SPLIT_EVAL_FRACTION = .10,
    input_label_cols=['review']
  ) AS

SELECT 
  string_field_0 , review from table;

However, the same transformation can be used inside a plain SELECT query:

SELECT 
  ML.NGRAMS(words_array, [1,2]) as ngrams, 
  review
from table;

Other transformation functions, such as bag_of_words and min_abs_scalar, can be used inside TRANSFORM. Why does this one behave differently? Is there an explicit list of transformers that cannot be used in the TRANSFORM clause?


Solution

  • I reproduced your issue. The function name in your query is missing an "S": it should be ML.NGRAMS, not ML.NGRAM.

    CREATE OR REPLACE MODEL
    `singular-hub-291814.movie_sentiment.mymodel3`
    TRANSFORM(ML.NGRAMS(string_field_0,[1,2])OVER() as ngram )
    OPTIONS
      ( model_type='LOGISTIC_REG',
        auto_class_weights=TRUE,
        data_split_method='RANDOM',
        DATA_SPLIT_EVAL_FRACTION = .10,
        input_label_cols=['review']
      ) AS
    
    SELECT 
      string_field_0 , review from table;
    

    After adding the "S" to the function name, the original error was replaced by a missing-table error, which is expected since I do not have a matching table at the moment.
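
    Once a real table is in place, the statement may still need a couple of adjustments. The following is a minimal sketch, not a tested query. It assumes that string_field_0 is a STRING column of space-separated words, that ML.NGRAMS takes an ARRAY<STRING> and is a scalar function (so it is written without OVER()), and that the label column has to be passed through TRANSFORM because only the TRANSFORM output columns are used for training. The model name `movie_sentiment.ngram_model_sketch` and the table name `movie_sentiment.reviews` are hypothetical placeholders.

    ```sql
    -- Sketch only: the model and table names are hypothetical placeholders.
    CREATE OR REPLACE MODEL `movie_sentiment.ngram_model_sketch`
    TRANSFORM(
      -- ML.NGRAMS works on an array of tokens, so split the text first
      -- (assumes string_field_0 is a STRING of space-separated words).
      ML.NGRAMS(SPLIT(string_field_0, ' '), [1, 2]) AS ngrams,
      -- Pass the label through so it is still available for training.
      review
    )
    OPTIONS (
      model_type = 'LOGISTIC_REG',
      auto_class_weights = TRUE,
      data_split_method = 'RANDOM',
      data_split_eval_fraction = 0.10,
      input_label_cols = ['review']
    ) AS
    SELECT
      string_field_0,
      review
    FROM `movie_sentiment.reviews`;
    ```

    If string_field_0 is already an array of words (like words_array in your SELECT example), the SPLIT call can be dropped and the column passed to ML.NGRAMS directly.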