I have this text:
Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples
I just want to get the part after 'Process explanation' but not include 'final activity...'
So like this:
The bottle is then melted to form liquid glass.
This is the current hive query which I want to convert to oracle:
SELECT REGEXP_EXTRACT(
'Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples',
'.*(process[ \t]*(explanation)?[ \t]*:[ \t]*)(.*?)([ \t]*;[ \t]*final[ \t]+activity[ \t]+for[ \t]+manager.*$|$)',
3) as extracted
FROM my_table
I've come up with something like this:
with strings as
(SELECT '1Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples' str FROM DUAL
union all
SELECT '2Process explanation:The bottle is then melted to form liquid glass;' str FROM DUAL
union all
SELECT '3Process :The bottle is then melted to form liquid glass' str FROM DUAL
union all
SELECT '4Process explanation: plasma gasification combined with centrifugal activity' str FROM DUAL
union all
SELECT '5Final activity for manager:Labeling of previous samples' str FROM DUAL
)
SELECT str
, REGEXP_SUBSTR(
str,
'(.*process[[:blank:]]*(explanation)?[[:blank:]]*:[[:blank:]]*)([A-Za-z0-9 ]*)([[:blank:]]*;[[:blank:]]*final[[:blank:]]*activity[[:blank:]]*for[[:blank:]]*manager.*$)?',
1, 1, 'i',3)
as extracted
FROM strings
Resulting in:
STR | EXTRACTED |
---|---|
1Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples | The bottle is then melted to form liquid glass |
2Process explanation:The bottle is then melted to form liquid glass; | The bottle is then melted to form liquid glass |
3Process :The bottle is then melted to form liquid glass | The bottle is then melted to form liquid glass |
4Process explanation: plasma gasification combined with centrifugal activity | plasma gasification combined with centrifugal activity |
5Final activity for manager:Labeling of previous samples | - |
Assuming matching blank group instead of your space and tab list [ \t] is ok. Edit: Modified the regexp a bit cause with possibility of last group being empty '.*' kept catching entire line.