I have an ETL task for datawarehouse-ing purposes, I need to extract the second part of a String after a delimiter occurence such as: '#', 'ý', '-'. For example test case string:
'Tori 1#MHK-MahallaKingaveKD' I should retrieve only 'MHK'
'HPHelm2ýFFS-Tredddline' I should retrieve only 'FFS'
I already tried using the cases above:
TRIM(CASE
WHEN INSTR('HPHelm2ýFFS-Tredddline', '#',1,1) > 0
THEN (REPLACE(
REGEXP_SUBSTR('HPHelm2ýFFS-Tredddline', '[^#]+', 1,2),
'#'
))
ELSE (CASE
WHEN INSTR('HPHelm2ýFFS-Tredddline', '-',1,1) > 0
THEN (REPLACE(
REGEXP_SUBSTR('HPHelm2ýFFS-Tredddline', '[^-]+', 1,2),
'-'
))
ELSE (CASE
WHEN INSTR('HPHelm2ýFFS-Tredddline','-') = 0 AND INSTR('HPHelm2ýFFS-Tredddline','ý') = 0 AND INSTR('HPHelm2ýFFS-Tredddline','#') = 0
THEN 'HPHelm2ýFFS-Tredddline'
ELSE (CASE
WHEN INSTR('HPHelm2ýFFS-Tredddline','ý',1,1) > 0
THEN (REPLACE(
REGEXP_SUBSTR('HPHelm2ýFFS-Tredddline', '[^ý]+', 1,2),
'ý'
))
END)
END)
END)
END)
Using the code above I can retrieve:
'Tori 1#MHK-MahallaKingaveKD' ====> 'MHK-MahallaKingaveKD'
'HPHelm2ýFFS-Tredddline' ====> 'FFS-Tredddline'
Expected output:
'Tori 1#MHK-MahallaKingaveKD' ====> 'MHK'
'HPHelm2ýFFS-Tredddline' ====> 'FFS'
So I have to exclude '-' and the string after.
I guess I should modify the regexp_substr pattern but can't seem to find a clear solution since '-' is specified in the case when statements as a delimiter.
I suggest retrieving the second occurrence of 1+ chars other than your delimiter chars:
regexp_substr(col, '[^#ý-]+', 1, 2)
Here, the search starts with the first char in the record (1
), and the second occurrence is returned (2
).
The [^#ý-]+
pattern matches one or more (+
) chars other than #
, ý
and -
.