We are converting legacy data to a new system. The users requested that we pull a PIN number out of a text comment field in the old system and place it in its own field in the new system.
The PIN number is indicated by the text "PIN" or "PN", with or without a pound sign, with varying amounts of white space in between. There can be any amount of letters or numbers surrounding the PIN. Sometimes the number comes before the text (9999 PIN).
I was hoping to use regular expressions, starting with something like this: (?<=PIN)\d{0,20}. When I put this into a regular expression tester it works and pulls '456' out of '123PIN456'. However we are using Oracle PL/SQL, which doesn't support the lookbehind function.
At this point I'm looking at using substr, replace, or other Oracle functions and writing separate lines to handle each possible permutation. I was hoping there was some better, faster, more efficient way.
Here are a handful of examples of the source data. I want the bold numbers only.
9999XX00999X PIN # 999999
9999XX000999 - PIN#909999
XXXXX XX PIN #9099999
9999XX99X PIN999099
99XX999X PN 9999999
9999999 PIN
90PA999 PN 999999
999009 PIN#
99PA99 PN#9999999
PIN 999990
XXXPIN990999
PIN#999099
PIN# 9990909
PIN#9099999*
You can use REGEXP_REPLACE
, like this:
select REGEXP_REPLACE(s, '^.*?PI?N ?#? ?(\d{1,20}).*$|^.*?(\d{1,20}) ?PI?N.*$', '\1\2')
from test;
The idea is to match the entire string from ^
to $
, but put only the pin portion into the capturing group parentheses. After that use \1
and \2
syntax to extract the value of capturing groups.