I have come across below query, but I am not able to conclude my analysis related to below query. The main aim of the below query is to convert the number to the alphabet. But the usage of the hierarchical query made me confuse.
merge into s_export ex
using (
select
listagg(n, '') within group (order by lv) new_val,
row_id
from
(
SELECT
connect_by_root rowid row_id,
LEVEL lv,
CASE
WHEN Regexp_like(Regexp_substr( file_as, '[^0-9]+|((\d+)(st|nd|rd|th)?)', 1, LEVEL ), '\d+')
THEN spell_number(
Regexp_substr( file_as, '[^0-9]+|((\d+)(st|nd|rd|th)?)', 1, LEVEL, NULL, 2),
Regexp_substr( file_as, '[^0-9]+|((\d+)(st|nd|rd|th)?)', 1, LEVEL, NULL, 3)
)
ELSE Regexp_substr( file_as, '[^0-9]+|((\d+)(st|nd|rd|th)?)', 1, LEVEL )
END N
FROM s_export d
CONNECT BY NOCYCLE Regexp_substr( file_as, '([^0-9]+)|(\d+)(st|nd|rd|th)?', 1, LEVEL ) IS NOT NULL
and rowid = prior rowid
and prior dbms_random.value is not null
)
group by row_id
) t
on (t.row_id = ex.rowid)
when matched then
update set ex.file_as = t.new_val;
Sample Dataset:
create table s_export (file_as varchar2(2000));
insert into s_export values ('Collection Four') ;
insert into s_export values ('OM_Forty-One One');
insert into s_export values ('OM_Twenty-Two | SOFifteen');
insert into s_export values ('1st');
insert into s_export values ('3M');
insert into s_export values ('Collection Six');
insert into s_export values ('2ND');
insert into s_export values ('11TH100');
Below is my understanding so far:
We are performing an update on the table s_export
column
file_as
whenever there is any number say 1
, it will convert this
to 'one'
.
As far as LEVEL
used in Regexp_substr
work as an occurrence.
I have analyzed the query mentioned above, Below is my analysis for reference.
Consider we have sample data in our table s_export, our Aim here is to convert numeric value appeared in a string into the alphabet.
file_as ROW_ID
------- -------
123Test123 101
We can divide the query into 3 parts starting from the innermost
SELECT
connect_by_root rowid row_id,
LEVEL lv,
CASE
WHEN Regexp_like(Regexp_substr( file_as, '[^0-9]+|((\d+)(st|nd|rd|th)?)', 1, LEVEL ), '\d+')
THEN spell_number(
Regexp_substr( file_as, '[^0-9]+|((\d+)(st|nd|rd|th)?)', 1, LEVEL, NULL, 2),
Regexp_substr( file_as, '[^0-9]+|((\d+)(st|nd|rd|th)?)', 1, LEVEL, NULL, 3)
)
ELSE Regexp_substr( file_as, '[^0-9]+|((\d+)(st|nd|rd|th)?)', 1, LEVEL )
END N
FROM s_export d
CONNECT BY NOCYCLE Regexp_substr( file_as, '([^0-9]+)|(\d+)(st|nd|rd|th)?', 1, LEVEL ) IS NOT NULL
and rowid = prior rowid
and prior dbms_random.value is not null
If there is any string in file_as column say in our case "123Test123", this query will take row_id of the particular row and get the substring from them eg consider row_id of this string as 101 now there will be 3 rows with the same row_id the only difference will be no of the level(No of splits) and every numeric will be converted into the alphabet, using (spell_number)
Original value : 123Test123
ROW_ID LV N
------- ------- -------
101 1 One Hundred Twenty-Three
101 2 Test
101 3 One Hundred Twenty-Three
select
listagg(n, '') within group (order by lv) new_val,
row_id
from (
---Query from Step 1
) group by row_id
Now with the result acheaved from the above output we will perform "listagg" (like pivot the column to row) and concatinate the values of column "N", acheaved from the inner most query to obtain single string value on the basis of grouping them with respect to row_id,
NEW_VAL ROW_ID
------- -------
One Hundred Twenty-Three Test One Hundred Twenty-Three 101
merge into s_export ex
using (
--Query from Step 2 (Including Step 1)
) t
on (t.row_id = ex.rowid)
when matched then
update set ex.file_as = t.new_val;
Now, we will merge "NEW_VAL", obtained from above query to "file_as" column of s_export, on the basis of row_id.
Final Transformed result :
file_as ROW_ID
------- -------
One Hundred Twenty-Three Test One Hundred Twenty-Three 101