In Oracle I had this code:
regexp_replace(DESCRIPTION_FIELD, '[^0-9.]+', '')
that worked fine, but now I'm getting the error:
ORA-01722 "Invalid Number".
Removing the decimal fixes this, but removes the decimal from the returning value. I'm guessing there's some bad data added recently. Solutions?
The regexp_replace()
isn't generating that error; the problem is when you cast the result of that replacement to a number. For example, for the original value XYZ2626...266.88
your pattern brings back 2626...266.88
, and to_number('2626...266.88')
throws ORA-01722.
by finding numbers with a format of "Some numbers, a decimal, then two numbers"
You could look for that rather than trying to exclude other characters:
with your_table (description_field) as (
select 'No money value' from dual
union all
select 'Some sensible 98765.43 value' from dual
union all
select '01234-1234545 54.00' from dual
union all
select 'XYZ2626...266.88' from dual
union all
select 'ABC-123.45XYZ' from dual
union all
select 'ABC123.45XYZ6.78' from dual
)
select description_field,
regexp_replace(DESCRIPTION_FIELD, '[^0-9.]+', '') as original,
regexp_replace(DESCRIPTION_FIELD, '.*?((-?\d+\.\d{2})[^0-9]*)?$', '\2') as new
from your_table;
DESCRIPTION_FIELD ORIGINAL NEW
---------------------------- -------------------- --------------------
No money value
Some sensible 98765.43 value 98765.43 98765.43
01234-1234545 54.00 01234123454554.00 54.00
XYZ2626...266.88 2626...266.88 266.88
ABC-123.45XYZ 123.45 -123.45
ABC123.45XYZ6.78 123.456.78 6.78
I've allowed for negative numbers but you might not want those... and if there is more than one potential money values it'll take the last one.
The capturing group (-?\d+\.\d{2})
looks for an optional minus sign, followed by any number of digits, followed by a period, followed by exactly 2 digits. But that on its own wouldn't prevent further digits afterwards, so it's followed by [^0-9]*
to make sure that doesn't happen. That combination is enclosed in a second grouping to allow it to be optional (followed by ?
) - otherwise values without anything that looks like a money amount are passed through unaltered, which will also error presumably.