Search code examples
regexstringextractlooker-studiore2

How can I extract file extension from string?


We have a custom field defined in data studio which extracts and returns the file extension from a string in this case it is from the event label.

I have been using the below with some success

REGEXP_EXTRACT(Event Label, '\\.([\\w\\.-]+)$')

However I'm finding if the string contains multiple periods its including that aswell

Eg it's also extracting text like

  • 07.21.pdf
  • 7.22.PDF
  • 07.21.docx
  • docx.pdf

How can I tweak my regex to only include from the last period and ignore any earlier.


Solution

  • You could try replacing [\\w\\.-] with [^\\.]

    \\.([^\\.]+)$
    

    [^\\.] will match everything except for ., so the match will not be able to contain dots inside.

    The full formula would look like this:

    REGEXP_EXTRACT(Event Label, '\\.([^\\.]+)$')