Search code examples
db2db2-400icu

Can I write a PCRE conditional that only needs the no-match part?


I am trying to create a regular expression to determine if a string contains a number for an SQL statement. If the value is numeric, then I want to add 1 to it. If the number is not numeric, I want to return a 1. More or less. Here is the SQL:

SELECT 
   field,
   CASE
     WHEN regexp_like(field, '^ *\d*\.?\d* *$') THEN dec(field) + 1
     ELSE 1
   END nextnumber
FROM mytable

This actually works, and returns something like this:

INVALID     1
00000       1
00001E      1
00379       380
00013       14
99904       99905

But to push the envelope of understanding, what if I wanted to cover negative numbers, or those with a positive sign. The sign would have to immediately precede or follow the number, but not both, and I would not want to allow white space between the sign and the number.

I came up with a conditional expression with a capture group to capture the sign on the front of the number to determine if a sign was allowed on the end, but it seems a little awkward to handle given I don't really need a yes-pattern.

Here is the modified regex: ^ ([+-]?)*\d*\.?\d*(?(1) *|[+-]? *)$

This works at regex101.com, but in order for it to work I need to have something before the pipe, so I have to duplicate the next pattern in both the yes-pattern and the no-pattern.

All that background for this question: How can I avoid that duplication?


EDIT: DB2 for i uses International Components for Unicode to provide regular expression processing. It turns out that this library does not support conditionals like PRCE, so I changed the tags on this question. The answer given by Wiktor Stribiżew provides a working alternative to the conditional by using a negative lookahead.


Solution

  • You do not have to duplicate the end pattern, just move it outside the conditional:

    ^ *([+-])?\d*\.?\d*(?(1)|[+-]?) *$
    

    See the regex demo. So, the yes-part is empty, and the no-part has an optional pattern.

    You may also solve it with a mere negative lookahead:

    ^ *([+-](?!.*[-+]))?\d*\.?\d*[+-]? *$
    

    See another regex demo. Here, ([+-](?!.*[-+]))? matches (optionally) a + or - that are not followed with any 0+ char followed with another + or -.