Search code examples
sqlregexreplacegoogle-bigqueryregexp-replace

Multiple delimiter process in bigquery sql


I need bigquery sql to process a column with outer-level delimiter ^G and inner-level delimiter space, from example if input column is a 11^Gb 22^Gc 33, then we should give two output columns a^Gb^Gc and 11^G22^G33. Thanks!


Solution

  • You can consider below

    -- sample data
    WITH sample_table AS (
     SELECT 'a 11^Gb 22^Gc 33' str UNION ALL
     SELECT 'a AA^Gb BB^Gc CC' str
    )
    -- query starts here
    SELECT str,
           (SELECT AS STRUCT
                   STRING_AGG(SPLIT(out, ' ')[OFFSET(0)], '^G') col1,
                   STRING_AGG(SPLIT(out, ' ')[OFFSET(1)], '^G') col2
              FROM UNNEST(SPLIT(str, '^G')) out
           ).*
      FROM sample_table;
    
    -- query result
    +------------------+---------+------------+
    |       str        |  col1   |    col2    |
    +------------------+---------+------------+
    | a 11^Gb 22^Gc 33 | a^Gb^Gc | 11^G22^G33 |
    | a AA^Gb BB^Gc CC | a^Gb^Gc | AA^GBB^GCC |
    +------------------+---------+------------+