Search code examples
sqlgoogle-bigqueryanalyticsgoogle-analytics-4universal-analytics

Need help recreating a UA advanced segment in BigQuery


I am currently trying to recreate an advanced segment from UA into bigquery and running into issues getting the numbers to match.

I need to exclude all sessions from users who have ever completed an event with a category of 'booking' and all sessions from outside the US. My current query gets me close to the number in our looker dashboard but its less than it should be.

This is my current code:

WITH Excluded_Users AS ( SELECT visitId AS exclusion_id FROM my_data.ga_sessions_*, UNNEST(hits) AS hits WHERE hits.eventInfo.eventCategory = 'Booking' )

SELECT CONCAT( FORMAT("%04d", EXTRACT(YEAR FROM PARSE_DATE("%Y%m%d", date))), '-', FORMAT("%02d", EXTRACT(MONTH FROM PARSE_DATE("%Y%m%d", date))) ) AS year_month,COUNT(DISTINCT IF(totals.visits=1, CONCAT(fullVisitorId, CAST(visitId AS STRING)), NULL) ) as sessions,

FROM my_data.ga_sessions_* s

LEFT JOIN Excluded_Users e ON s.visitId = e.exclusion_id

WHERE e.exclusion_id IS NULL AND s.totals.visits = 1 AND geoNetwork.Country = 'United States'

GROUP BY 1

Solution

  • A couple of things:

    • visitId is just a timestamp of the session (ignoring the midnight split unlike visitStartTime)
    • I think GA UA uses fullvisitorid+visitstarttime to calculate sessions
    • only cross join hits if necessary, save some resources by using subqueries
    • you didn't share your segment setup from UA, but let's hope we can resolve it :)
    SELECT 
      CONCAT( FORMAT("%04d", EXTRACT(YEAR FROM PARSE_DATE("%Y%m%d", date))), '-', FORMAT("%02d", EXTRACT(MONTH FROM PARSE_DATE("%Y%m%d", date))) ) AS year_month,
      COUNT(1) as sessions, 
    -- no need to count(distinct fullvisitorid||visitstarttime) since every row is a session
    FROM `my_data.ga_sessions_*` s
    WHERE 
      NOT exists(select 1 from unnest(hits) where eventInfo.eventCategory = 'Booking')
      AND s.totals.visits = 1 
      AND geoNetwork.Country = 'United States'
    GROUP BY 1