Search code examples
sqlregexpostgresqlregex-group

How to get second material name from product description


Product description is string like

Main fabric: 50% Organic Cotton, 49% Cotton, 1% Elastane

Material names are between % and , or up to end of string for last material name Elastane.

Tried to get second material name "Cotton" using

select (regexp_matches('Main fabric: 50% Organic Cotton, 49% Cotton, 1% Elastane', '%\s(.+),?','g'))[2]

but query does not return any data.

How to get second product name? Using PostgreSql 16


Solution

  • Again, regexp_substr() can handle that:
    demo at db<>fiddle

    select regexp_substr('Main fabric: 50% Organic Cotton, 49% Cotton, 1% Elastane', 
                        '% ([^,]+),?',1,2,'',1)
    
    regexp_substr
    Cotton
    • You're looking for % followed by the material name
    • The material name is expected to be anything after that, up until the end or nearest comma. The [^,]+ means one or more character other than the comma, so it tolerates spaces, hyphenated names etc.
    • ,? mean at most one comma, so that you don't need a trailing comma at the end to get a match

    The parameters after the pattern are the same as in the post in the other thread.

    • 1 tells it to start at the beginning
    • 2 means you want the second match
    • '' is where you'd add flags, like 'i' for case insensitive matching
    • 1 means the first parenthesised sub-expression within the pattern, so it'll only catch the material name, without the leading % and the optional trailing comma.