I wonder whether someone may be able to help me please.
My eventinfo.eventLabel field in my bigquery table contains the following data:
View View Code red for the BDAT group
SELECT
#select all fields excluding those under the hits record
* EXCEPT (hits),
#start array - this rebuilds the hit record
ARRAY(
SELECT
#unnest the hit field, select each field excluding those under the page record
AS STRUCT * EXCEPT (eventInfo ),
(
SELECT
#select all page fields excluding pageTitle
AS STRUCT eventInfo.* EXCEPT (eventLabel),
#remove the query parameter from the pagePath fields
REGEXP_REPLACE(eventinfo.eventLabel, r'View\sView .*', 'View View redacted') AS eventLabel) AS eventinfo
FROM
UNNEST(hits) ) AS hits
FROM
`bigquery.Tested.ga_sessions_20180801`
I'm trying to remove the "Code red" element of the data and replace it with the term "redacted".
Please note that "Code red" is not the only value and the number of characters can increase and decrease. But the "View View" and "for the" are constant throughout my data.
I know that the problem is the REGEXP REPLACE line, and I can manage to put the "redacted" text into the field, but I've not been able to remove the "Code red" text.
Could someone possibly look at this and offer some guidance on how I may be able to change this.
Many thanks and kind regards
Chris
select
[
REGEXP_REPLACE('View View Code red for the BDAT group', r'View View .*? for the', 'View View redacted for the'),
REGEXP_REPLACE('View View Code blue for the BDAT group', r'View View .*? for the', 'View View redacted for the'),
REGEXP_REPLACE('View View red code for the BDAT group', r'View View .*? for the', 'View View redacted for the')
]