Search code examples
sqlgoogle-bigqueryflattengoogle-cloud-vertex-ai

How to query / flatten from vertex ml results saved to bigquery


This is driving me bonkers so any help greatly appreciated.

I am using Google's Vertex ML. I have exported a batch prediction to BigQuery.

The schema is I believe a record with repeat fields.

So I think it would like this in JSON:

[{"category":true,"score":.9999},{"category":false,"score",.05}]

I can not figure out how to either unnest or narrow a search where a category is true.

I need to have a flat select that has the correct category column and score value

123 | true | .9999
123 | false | .05

or a select with a where clause to only get true values

123 | .9999

The following unnests everything but it creates four rows joining both the true and false to both the scores.

SELECT
  row_id,
  classes,
  scores
FROM
  `database`
cross JOIN
  UNNEST(exported.classes) AS classes,
  UNNEST(exported.scores) AS scores
LIMIT
  10

creates rows like:

123 | true | .9999
123 | false | .9999
123 | true | .05
123 | false | .05

This does select the values I need but it's still a nested field...

select
row_id,
classes.classes,
classes.scores
from (
SELECT
  voter_id,
  ARRAY_CONCAT([predicted_results]) as the_results
FROM
  `data`
LIMIT
  10
),
unnest(the_results) as classes

creates rows like

123 | [true:.9999,false:.05]

Solution

  • select 
     primary_key, 
      predicted_supports.classes[SAFE_OFFSET(index)] as class,
      predicted_supports.scores[SAFE_OFFSET(index)] as score,
    FROM `database`,
    unnest(generate_array(0,array_length(predicted_supports.classes)-1)) as index
    

    Here is the ouput:

    enter image description here