I have a problem with a script that migrates data from an old table to a new one. One of the columns in the old table is a CLOB but in the new table it is VARCHAR2. I tried to insert those using the code a below. But I got in trouble with error:
ORA-06502: PL/SQL: numeric or value error: character string buffer too small.
DECLARE
CURSOR CUR IS
SELECT T.*
FROM ACTIVITY_EVENT T
WHERE T.POST_TEXT IS NOT NULL;
R CUR%ROWTYPE;
BEGIN
FOR R IN CUR
LOOP
INSERT INTO STREAM_TEXT
WITH STR AS
(SELECT T.*
FROM STREAM T
WHERE T.OLD_ID = R.ID)
SELECT SEQ$STREAM_TEXT.NEXTVAL AS ID,
DBMS_LOB.SUBSTR(T.POST_TEXT, 4000, 1) AS TEXT,
T.DT AS DT,
'READY' AS STATE,
STR.ID AS STREAM_ID
FROM ACTIVITY_EVENT T
LEFT JOIN STR
ON STR.OLD_ID = T.ID
WHERE T.POST_TEXT IS NOT NULL
AND STR.OLD_ID = T.ID;
END LOOP;
END;
I did this code without the loop first and got the same problem so I tried to create a loop. But the result is the same.
This simple query fails with the same error:
SELECT T.ID, DBMS_LOB.SUBSTR(T.POST_TEXT, 4000, 1)
FROM ACTIVITY_EVENT T
WHERE T.POST_TEXT IS NOT NULL
You'll get this error if your source column is actually a NCLOB, rather than a CLOB. This is OK:
create table t42 (id number, dt date, post_text clob);
insert into t42 (id, dt, post_text) values (1, sysdate, dbms_random.string('p', 4000));
select id, dbms_lob.substr(post_text, 4000, 1) from t42;
But this errors, just changing CLOB to NCLOB, with any length above 2000:
create table t42 (id number, dt date, post_text nclob);
insert into t42 (id, dt, post_text) values (1, sysdate, dbms_random.string('p', 4000));
select id, dbms_lob.substr(post_text, 4000, 1) from t42;
SQL Error: ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at line 1
This is with AL32UTF8 and AL16UTF16 as the database and national character sets.
So if your source table is NCLOB, you can only extract the first 2000 characters to put into your stream_text
table.
You will also see this if the source column is a CLOB and the first 4000 characters contain any multibyte characters. dbms_log.substr(x, 4000, 1)
always gets the first 4000 characters for a CLOB, which may be more than 4000 bytes - which is the maximum size of a VARCHAR2 value in a SQL context, even if it's declared as varchar2(4000 char)
, as it still can't exceed the 4000-byte limit.
If you want to get the maximum 4000 characters out then you can do that via a PL/SQL VARCHAR2 variable, with an additional substrb()
call:
DECLARE
CURSOR CUR IS
SELECT SEQ$STREAM_TEXT.NEXTVAL AS ID,
T.POST_TEXT,
T.DT AS DT,
'READY' AS STATE,
S.ID AS STREAM_ID
FROM ACTIVITY_EVENT T
LEFT JOIN STREAM S
ON S.OLD_ID = T.ID
WHERE T.POST_TEXT IS NOT NULL;
TMP_TEXT VARCHAR2(4000);
BEGIN
FOR R IN CUR
LOOP
TMP_TEXT := SUBSTRB(DBMS_LOB.SUBSTR(R.POST_TEXT, 4000, 1), 1, 4000);
INSERT INTO STREAM_TEXT (ID, TEXT, DT, STATE, STREAM_ID)
VALUES (R.ID, TMP_TEXT, R.DT, R.STATE, R.STREAM_ID);
END LOOP;
END;
/
The substrb(..., 1, 4000)
part won't work in SQL either, as the inner expression is still too big, but it does work in PL/SQL. You get the first 4000 bytes of the first 4000 characters. (Though you may have a problem still if the 4000th bytes is part-way through a multi-byte character).
I've guessed the column names in the target table, so use the real ones, obviously. Doing a bulk insert would be better if you have a lot of data; fetching into a collection and using FORALL to insert in bulk rather than row-by-row; something like this would work as a starting point:
DECLARE
TYPE TMP_REC_TYPE IS RECORD (
ID STREAM_TEXT.ID%TYPE,
POST_TEXT ACTIVITY_EVENT.POST_TEXT%TYPE,
TEXT STREAM_TEXT.TEXT%TYPE,
DT STREAM_TEXT.DT%TYPE,
STATE STREAM_TEXT.STATE%TYPE,
STREAM_ID STREAM_TEXT.STREAM_ID%TYPE
);
TYPE TMP_REC_TAB_TYPE IS TABLE OF TMP_REC_TYPE;
TMP_REC_TAB TMP_REC_TAB_TYPE;
RC SYS_REFCURSOR;
BEGIN
OPEN RC FOR
SELECT SEQ$STREAM_TEXT.NEXTVAL AS ID,
T.POST_TEXT,
NULL AS TEXT,
T.DT AS DT,
'READY' AS STATE,
S.ID AS STREAM_ID
FROM ACTIVITY_EVENT T
LEFT JOIN STREAM S
ON S.OLD_ID = T.ID
WHERE T.POST_TEXT IS NOT NULL;
LOOP
FETCH RC BULK COLLECT INTO TMP_REC_TAB LIMIT 100;
FOR I IN 1..TMP_REC_TAB.COUNT LOOP -- populate text field
TMP_REC_TAB(I).TEXT := SUBSTRB(
DBMS_LOB.SUBSTR(TMP_REC_TAB(I).POST_TEXT, 4000, 1), 1, 4000);
END LOOP;
FORALL I IN 1..TMP_REC_TAB.COUNT -- bulk insert
INSERT INTO STREAM_TEXT (ID, TEXT, DT, STATE, STREAM_ID)
VALUES (TMP_REC_TAB(I).ID, TMP_REC_TAB(I).TEXT, TMP_REC_TAB(I).DT,
TMP_REC_TAB(I).STATE, TMP_REC_TAB(I).STREAM_ID);
EXIT WHEN RC%NOTFOUND;
END LOOP;
END;
/