We are running an older Oracle Server, 10.1.0.5.
I have some simple code replaces one CLOB with another in a different column. I am replacing specific strings with another simpler string in a different column, so I can perform some basic XML Parsing. Ref: here.
I would expect to do a series of update commands, with commits in between each, and have all of the fields updated properly. If I run it as follows, only the last query gets committed, despite running as a script. The intermittent commits don't seem to take.
I can get it to work, by running each update + commit line by line, but I am wondering why I can't run these all at once. I have only shown the first 2 updates. What am I missing?
set define off
update TABLE1 SET COL_CLEANED = replace(COL_ORIGINAL,' ', ' ');
COMMIT;
update TABLE1 SET COL_CLEANED = replace(COL_ORIGINAL,'§', 'SECT.');
COMMIT;
update TABLE1 ...
Thank you,
sse
As @OldProgrammer said, you're repeatedly replacing the COL_CLEANED
value and overwriting earlier changes...
Imagine you started with a simple value that required multiple replacements:
create table table1 (col_original clob, col_cleaned clob);
insert into table1 (col_original)
values (to_clob('Test without breaks - §1'));
and did the first update:
update TABLE1 SET COL_CLEANED = replace(COL_ORIGINAL,' ', ' ');
select col_original, col_cleaned from table1;
COL_ORIGINAL COL_CLEANED
---------------------------------------- ----------------------------------------
Test without breaks - §1 Test without breaks - §1
Now you do the second update:
update TABLE1 SET COL_CLEANED = replace(COL_ORIGINAL,'§', 'SECT.');
select col_original, col_cleaned from table1;
COL_ORIGINAL COL_CLEANED
---------------------------------------- ----------------------------------------
Test without breaks - §1 Test without breaks - SECT.1
There has been no commit and no rollback. But you took the original value - which still had both the old patterns - and replaced the second pattern in that. The first update has simply been overwritten by the second.
You need to apply successive updates to the cleaned value; only the first update uses the original value:
-- first update is based on COL_ORIGINAL
update TABLE1 SET COL_CLEANED = replace(COL_ORIGINAL,' ', ' ');
select col_original, col_cleaned from table1;
COL_ORIGINAL COL_CLEANED
---------------------------------------- ----------------------------------------
Test without breaks - §1 Test without breaks - §1
-- subsequent updates are based on COL_CLEANED to keep earlier changes
update TABLE1 SET COL_CLEANED = replace(COL_CLEANED,'§', 'SECT.');
--------------------------------------------^^^^^^^
select col_original, col_cleaned from table1;
COL_ORIGINAL COL_CLEANED
---------------------------------------- ----------------------------------------
Test without breaks - §1 Test without breaks - SECT.1
The commits aren't really needed, incidentally; you should commit once per logical transaction, not once per statement.
Incidentally, you could use the utl_i18n.unescape_reference()
function for this, but in your example it will give you a mutibyte 'section' character rather than the string 'SECT.'
:
update TABLE1 SET COL_CLEANED = UTL_I18N.UNESCAPE_REFERENCE(COL_ORIGINAL);
select col_original, col_cleaned from table1;
COL_ORIGINAL COL_CLEANED
---------------------------------------- ----------------------------------------
Test without breaks - §1 Test without breaks - §1
and it may make other changes differently to your other hard-coded replacements. On the other hand, at some point your col_original
has probably been passed through a similar function to escape references to the values you currently have stored; as the real original text would have had §
for that to have been escaped to §
, it may be more appropriate to reinstate it to that original character anyway. Unless you're ending up in a character set that can't display it, perhaps.