
Cannot seem to extract a word from a URL with a regex (Data Studio)


I read somewhere that Data Studio uses a slightly different regular expression flavor from other places: it uses RE2. I managed to find a site for testing RE2 regexes and got my pattern working there, but it does not work in Data Studio.

I have these URLs I want to extract from:

/marketing/news-717777

/finance/news-123456?asdasdasd_asdad

I want the regex to extract the word with the dash and the number, "news-******". The result should look like this:

news-717777
news-123456

I cannot seem to get it to work in Data Studio. The patterns I have tried are the following:

(news-).*(?=\?)|(news-).*
(news-).*(?=\\?)|(news-).*
(news-.*?)\?
(news-.*).*(?=\?)

The closest I get is news with the number, "news-***", but I cannot remove the "?" that comes after it. Does anyone have any ideas on this? Thank you in advance.


Solution

  • You can use several solutions here.

    Solution 1: matching digits after a specific string (here, news-)

    (news-[0-9]+)
    

    See the regex demo; [0-9]+ matches one or more digits.

    Solution 2: if news- can be followed by chars other than digits (any char other than ?), you can use

    (news-[^?]+)
    

    See this regex demo, where [^?]+ matches one or more chars other than a ? char.
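
As a quick sanity check outside Data Studio, here is a minimal Python sketch that runs both patterns against the two URLs from the question. Python's re module is not RE2, so this only approximates Data Studio's behavior; it works here because both patterns avoid lookaheads such as (?=\?), which RE2 does not support. The helper name extract_first_group is made up for this illustration.

```python
import re

# Patterns from the two solutions above. Python's re module is not RE2,
# but both patterns use only syntax that the two engines share.
DIGITS_ONLY = re.compile(r"(news-[0-9]+)")
UP_TO_QUERY = re.compile(r"(news-[^?]+)")


def extract_first_group(pattern, url):
    """Return the first capturing group of the first match, or None."""
    match = pattern.search(url)
    return match.group(1) if match else None


# Sample URLs taken from the question.
urls = ["/marketing/news-717777", "/finance/news-123456?asdasdasd_asdad"]

for url in urls:
    print(extract_first_group(DIGITS_ONLY, url), extract_first_group(UP_TO_QUERY, url))
```

In Data Studio itself, the pattern would presumably go into a calculated field along the lines of REGEXP_EXTRACT(Page, "(news-[0-9]+)"), where Page is an assumed name for the field holding the URL.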