Search code examples
sqlregexgoogle-bigqueryregexp-replace

regex_replace value that starts with char, followed by integers BigQuery SQL


I have a table in BigQuery that I need to clean up. In column content_column I have values like:

content_column
/a68786776
/g263647
/g47664
/galleries
/music

I want to replace all values that start with /g followed by any number of integers to a string gallery. The desired output is:

content_column
/a68786776
gallery
gallery
/galleries
/music

I have tried the following:

SELECT regexp_replace(content_column,r'\/(\w+\d+)\/', "gallery") as content_column
FROM `my_table`

This works but it also converts /a68786776 to gallery. I tried to replace (\w+\d+) to (\^g\d+) but that didn't convert anything to gallery.

Thanks for help in advance.


Solution

  • Consider below approach

    select if(
        regexp_contains(content_column,r'^/g\d+'), 
        'gallery', 
        content_column
      ) as content_column
    from `my_table`         
    

    if applied to sample data in your question - output is

    enter image description here