I need to extract values delimited by semicolon in a text. I use regexp_substr which is very powerful but the string has a limit. I have sometimes a text with 10 000 characters so is not possible to use such string in this inbuild function.
the text looks like this - AU783 343; 3N9493 ;113 UN9 WE3 54 ; OI8343, ; 43U.783.3
just 0-9 and A-Z , if there is a colon or period then it should be deleted. the output must be as it is but withou leading and trailing spaces
AU783 343
3N9493
113 UN9 WE3 54
OI8343
43U7833
any suggestion how to avoid the max length limit?
You don't need to use regular expressions. The simple string functions SUBSTR
, INSTR
and REPLACE
are sufficient and work with CLOB
values.
For example, given the table:
CREATE TABLE table_name (
id NUMBER
GENERATED ALWAYS AS IDENTITY
PRIMARY KEY,
value CLOB
);
You can extract all the substrings using:
WITH bounds (id, value, s_pos, e_pos) AS (
SELECT id,
value,
1,
INSTR(value, ';', 1)
FROM table_name
UNION ALL
SELECT id,
value,
e_pos + 1,
INSTR(value, ';', e_pos + 1)
FROM bounds
WHERE e_pos > 0
)
SEARCH DEPTH FIRST BY id SET id_order
SELECT id,
REPLACE(
REPLACE(
CASE e_pos
WHEN 0
THEN SUBSTR(value, s_pos)
ELSE SUBSTR(value, s_pos, e_pos - s_pos)
END,
':'
),
'.'
) AS value
FROM bounds;
See db<>fiddle for an example with very large strings.