Search code examples
regexpostgresqlcastingregex-group

postgresql regex grouping + casting issues when extracting price text


In postgresql, I am trying to return a formatted price value for a price in a text field, but ONLY when the text field value doesn't contain other superfluous text.

For example, with these three text values...

  • 'Offers over $580,000'

  • '499000'

  • 'NOW $419,000 - WAS $459,000'

I would want to return...

  • 'Offers over $580,000'

  • '$499,000'

  • 'NOW $419,000 - WAS $459,000'

I have been attempting to use regex to extract my value and then casting to decimal and then formatting the output to get the value I want. For example...

SELECT
CASE WHEN COALESCE(SUBSTRING(btrim('499000'::text) FROM '(^(\$?[0-9,]*)(\.[0-9]{2})?)$'),'') != '' 
THEN to_char(substring('499000' FROM '^((\$?[0-9,]*)(\.[0-9]{2})?)$')::decimal, 'FM$999,999,999'::text) 
ELSE '499000' 
END as testresult

When I swap out the '499000' for 'Offers over $580,000' I get an error because regex is still picking this up as valid and then trying to convert it to decimal. However I'm having trouble understanding why this second value passes the regex because I am saying that the string must start with the option dollar-sign + number combo. I think.

Can anyone point me in the right direction on this one? I have seen numerous other regex posts but none that help me answer this question (happy to be pointed to another post if it contain the solution).

Thanks!


Solution

  • Here's a regex option you may find convenient. It keeps decimals when there are any, and uses PostgreSQL's native currency formatting:

    with vals(val) as (values ('Offers over $580,000'::text), ('499000'), ('NOW $419,000 - WAS $459,000'), ('  $ -1.87857 '), (' - $1.87857 '))
    select case when val ~ '^\s*(-?\s*\$?|\$?\s*-?)[0-9,]+\.[0-9]+\s*$' then val::money::text 
                when val ~ '^\s*(-?\s*\$?|\$?\s*-?)[0-9,]+\s*$' then left(val::money::text, -3)
                else val end
    from vals