sql regex postgresql substring

Substring before first uppercase word, excluding the first word


A string contains words separated by spaces.

How can I get the substring from the start of the string up to, but not including, the first uppercase word (a word starting with an uppercase letter)? The search should start from the second word: the first word should always appear in the result, even if it starts with an uppercase letter.

For example

select substringtiluppercase('Aaa b cC Dfff dfgdf')

should return

Aaa b cC

Can a regexp substring be used, or is there another approach?

Using PostgreSQL 13.2

Uppercase letters are the Latin letters A .. Z and additionally Õ, Ä, Ö, Ü, Š, Ž.


Solution

  • Replace everything from the first word start that begins with an uppercase letter (ignoring the very start of the string) onwards with an empty string:

    regexp_replace('aaa b cc Dfff dfgdf', '(?<!^)\m[A-ZÕÄÖÜŠŽ].*', '')
    


    In the Postgres flavour of regex, \m matches only at the beginning of a word.

    (?<!^) is a negative lookbehind asserting that the match does not begin at the start of input, which is what keeps the first word in the result even when it is uppercase.

    FYI, the other Postgres word boundaries are \M at the end of a word, \y at either end (same as the usual \b), and \Y where there is no word boundary (same as the usual \B).
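
To get the call syntax used in the question, the expression can be wrapped in a function. The following is only a sketch: the function name is taken from the question, while the parameter name s and the rtrim call are assumptions added here. Without rtrim, the space in front of the removed uppercase word stays in the result, because the match starts at the word itself. It also assumes a UTF-8 database whose locale treats Õ, Ä, Ö, Ü, Š, Ž as word characters; otherwise \m may not recognise a word start before those letters.

```sql
-- Sketch of a wrapper around the regexp_replace expression above.
-- Name taken from the question; parameter name "s" is hypothetical.
CREATE OR REPLACE FUNCTION substringtiluppercase(s text)
RETURNS text
LANGUAGE sql
AS $$
    -- Remove everything from the first word that starts with an
    -- uppercase letter (but never the first word), then strip the
    -- trailing space left in front of the removed part.
    SELECT rtrim(
        regexp_replace(s, '(?<!^)\m[A-ZÕÄÖÜŠŽ].*', '')
    );
$$;

-- Example from the question; the desired result is 'Aaa b cC'.
SELECT substringtiluppercase('Aaa b cC Dfff dfgdf');

-- No uppercase word after the first one: nothing is removed.
SELECT substringtiluppercase('aaa b cc');
```

Note that rtrim also strips trailing spaces that were already present in the input, so leave it out if those must be preserved.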