I am trying to get calculate and retrieve some indicators from mutiple tables I have in my dataset on bigquery. I am want to invoke nesting on sfam
which is a column of strings which I can't do for now i.e. it could have values or be null. So the goal is to transform that column into an array/record...that's the idea that came to mind and I have no idea how to go about doing it.
The product
and cart
are grouped by key_web
, dat_log
, univ
, suniv
, fam
and sfam
.
The data is broken down into universe refered to as univ
which is composed of sub-universe refered to as suniv
. Sub-universes contain families refered to as 'fam' which may or may not have sub-families refered to as sfam
. I want to invoke nesting on prd.sfam
to reduce the resulting columns.
The data is collected from Google Analytics for insight into website trafic and users activities.
I am trying to get information and indicators about each visitor, the amount of time he/she spent on particular pages, actions taken and so on. The resulting table gives me the sum of time spent on those pages, sum of total number of visits for a single day and a breakdown to which category it belongs, thus the univ
, suniv
, fam
and sfam
columns which are of type string (the sfam could be null since some sub-universes suniv only have families fam
and don't go down to a sub-family level sfam
.
dat_log
: refers to the date
nrb_fp
: number of views for a product page
tps_fp
: total time spent on said page
I tried different methods that I found online but none worked, so I post my code and problem in hope of finding guidance and a solution !
A simpler query would be:
select
prd.key_web
, dat_log
, prd.nrb_fp
, prd.tps_fp
, prd.univ
, prd.suniv
, prd.fam
, prd.sfam
from product as prd
left join cart as cart
on prd.key_web = cart.key_web
and prd.dat_log = cart.dat_log
and prd.univ = cart.univ
and prd.suniv = cart.suniv
and prd.fam = cart.fam
and prd.sfam = cart.sfam
And this is a sample result of the query for the last 6 columns in text and images:
Again, I want to get a column of array as sfam
where I have all the string values of sfam
even nulls.
I limited the output to only only the last 6 columns, the first 3 are the row, key_web
and dat_log
. Each fam
is composed of several sfam
or none (null), I want to be able to do nesting on either the fam
or sfam
.
Obviously, in BigQuery you cannot output array which holds NULL, but if for some reason you need to preserve them somehow - the workaround is to create array of structs as opposed to arrays of single elements
For example (BigQuery Standard SQL) if you try to execute below
SELECT ['a', 'b', NULL] arr1, ['x', NULL, NULL] arr2
you will get error: Array cannot have a null element; error in writing field arr1
While if you will try below
SELECT ARRAY_AGG(STRUCT(val1, val2)) arr
FROM UNNEST(['a', 'b', NULL]) val1 WITH OFFSET
JOIN UNNEST(['x', NULL, NULL]) val2 WITH OFFSET
USING(OFFSET)
you get result
Row arr.val1 arr.val2
1 a x
b null
null null
As you can see - approaching this way - you can have have even both elements as NULL