Search code examples
regexpostgresqltrim

Trimming text from a multi-line entry in postgres with regex


I have a column "verbatim" where each entry contains multiple lines. Here's an example:

Dummy field1:Text

Tell Us More:Text to capture

Dummy field2:Text

I'd like to capture only Text to capture text in the second line Tell Us More: and put that value into the column verbatim_scrubbed. In the example above, Text to capture would be the entry in verbatim_scrubbed.

I'm not that great with postgres and regexp, so I was hoping somebody could help me out here. Was thinking of something similar to the following:

update TABLE
set verbatim_trimmed = array_to_string(regexp_matches(verbatim,'tell us more:(.*)','gi'));

This doesn't work, but I have a feeling something similar may work.


Solution

  • Perhaps there is a direct way to capture the: Text to capture without the cariage return \r and the new line \n charracters (without using the regexp_replace).

    Here is what you can do:

    select regexp_replace(array_to_string(regexp_matches(verbatim, '^Tell Us More:(.*)$','n'),'',''), E'[\\r\\n]', '' ) from my_table;