Using Oracle 12c, how can I use regexp_substr to separate a tab delimited record whose fields may contain spaces? The record has four fields. The third field contains words with spaces.
I'm using this as my reference:Oracle Regex
Here's my query
with tab_delimited_record as
(select 'Field1 Field2 This is field3 and contains spaces Field4' as record_with_fields from dual)
select record_with_fields,
regexp_substr('\S+',1,3) as field3a, -- Expect ==>This is field3...
regexp_substr('\t+',1,3) as field3b, -- Expect==>This is field3...
regexp_substr('[[::space::]]+',1,3) as field_3c -- Another version
from tab_delimited_record
Desired Results
RECORD_WITH_FIELDS
Field1 Field2 This is field3 and contains spaces Field4
FIELD3
This is field3 and contains spaces
I believe you are looking for something like this. Note this example returns all fields for example's sake but of course you can just select field3 if that's all you need. The CTE builds string with tab-delimited fields. The query then uses regex_substr to get the nth (4th argument) string of characters followed by a TAB or the end of the line.
with tab_delimited_record(record_with_fields) as (
select 'Field1'||chr(09)||'Field2'||chr(09)||'This is field3 and contains spaces'||chr(09)||'Field4' from dual
)
select record_with_fields,
regexp_substr(record_with_fields, '(.*?)('||chr(09)||'|$)', 1, 1, null, 1) as field_1,
regexp_substr(record_with_fields, '(.*?)('||chr(09)||'|$)', 1, 2, null, 1) as field_2,
regexp_substr(record_with_fields, '(.*?)('||chr(09)||'|$)', 1, 3, null, 1) as field_3,
regexp_substr(record_with_fields, '(.*?)('||chr(09)||'|$)', 1, 4, null, 1) as field_4
from tab_delimited_record;