I am trying to select hits.page.searchKeyword in a query, but I get the classical nested error. Do I have another option rather than unnest?
When I unnest data from hits (GA3 - UA) all my other selects duplicate their value, and it's a mess to fix. I just need to add the search term to my query, which I have identified in the scheme as hits.page.searchKeyword, and I am done. But I need to unnest the hits array without unnesting function.
SELECT
parse_date('%Y%m%d', date) AS date,
channelGrouping,trafficSource.campaign,
hits.page.searchKeyword as searchterm,
SUM(totals.visits) AS sessions,
count(distinct clientId),
SUM(totals.transactions) AS transactions,
SUM(totals.totalTransactionRevenue) AS revenue1mil
FROM `vdxl-prod-data-reporting-01.200759185.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20231102'
GROUP BY 1, channelGrouping, trafficSource.campaign, hits.page.searchKeyword
ORDER BY 1 DESC
Error:
Cannot access field page on a value with type
ARRAY<STRUCT<hitNumber INT64, time INT64, hour INT64, ...>> at [4:8]
UNNEST()
simply turns an array into a relation. It doesn't automatically join that relation with the main table. That only happens if you do something like FROM `vdxl-prod-data-reporting-01.200759185.ga_sessions_*` t cross join t.hits
(sometimes the cross join
is replaced with its alias ,
)
hits.page.searchKeyword
is the internal search and means there can be multiple of these per session - how would you like to aggregate them to session scope?
One way could be to get a distinct list sorted by these keywords:
SELECT
parse_date('%Y%m%d', date) AS date,
channelGrouping,
trafficSource.campaign,
(select string_agg(distinct page.searchKeyword order by page.searchKeyword) from unnest(hits)) as searchterm,
SUM(totals.visits) AS sessions,
count(distinct clientId),
SUM(totals.transactions) AS transactions,
SUM(totals.totalTransactionRevenue) AS revenue1mil
FROM `vdxl-prod-data-reporting-01.200759185.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20231102'
GROUP BY 1, 2, 3, 4
ORDER BY 1 DESC
If you want one keyword per row you need the cross join - and of course it repeats session scoped rows for every hit in that row ... that's what joins do. You need to calculate your metrics on hit scope then and can't use session aggregates/totals anymore:
SELECT
parse_date('%Y%m%d', date) AS date,
channelGrouping,
trafficSource.campaign,
h.page.searchKeyword,
count(distinct fullvisitorid || visitstarttime) AS sessions,
count(distinct clientId) as clients,
countif(h.ecommerceaction.action_type='6') AS transactions,
SUM(h.transaction.transactionrevenue /1000000) AS revenue
FROM `vdxl-prod-data-reporting-01.200759185.ga_sessions_*` as t
CROSS JOIN t.hits as h
WHERE
_TABLE_SUFFIX BETWEEN '20230101' AND '20231102'
GROUP BY
1, 2, 3, 4
ORDER BY
1 DESC, 2, 3, 4