Search code examples
oracle-databasedatabase-performancenotin

Rewrite NOT IN, but subquery involves a comma seperated string (ID)


Changing Oracle stored procedure using SQL Developer.

Input: A comma separated IDs. (Example: 'P23,P37,P39,P45') Edit: please note that the input is a string, not an array of string. Also the string could be more than just 4 IDs. Could go up to about 200.

Want to find out from the table that does NOT have those input IDs.

The following is too slow. Just about 300 rows of data (in table) but takes about 20 seconds. So I want to rewrite. Please give me some hints on how to do.

ID_Array is 'P23,P37,P39,P45'.

SELECT * FROM StudentInfo
WHERE StudentClass = 'Primary5A'
AND StudentID NOT IN
(
    SELECT REGEXP_SUBSTR(ID_Array, '[^,]+', 1, LEVEL) StudentID
    FROM DUAL
    CONNECT BY REGEXP_SUBSTR(ID_Array, '[^,]+', 1, LEVEL) IS NOT NULL   
)
AND Height <= 150;

Some of you may know it already. The following

    SELECT REGEXP_SUBSTR(ID_Array, '[^,]+', 1, LEVEL) StudentID
    FROM DUAL
    CONNECT BY REGEXP_SUBSTR(ID_Array, '[^,]+', 1, LEVEL) IS NOT NULL   

will turn ID_Array into a table (table-like structure?) with four rows:

+-----+
| P23 |
| P37 |
| P39 |
| P45 |
+-----+

Solution

  • Your ID_Array must be a lot longer than your example here. I get very good performance with 'P23,P37,P39,P45'.

    With longer strings, REGEXP_SUBSTR can get pretty slow. I'd suggest using LIKE instead whenever possible, even if it gets weird. Try this one.

    SELECT * FROM StudentInfo
    WHERE StudentClass = 'Primary5A'
    AND ','||ID_Array||',' NOT LIKE '%,'||StudentID||',%'
    AND Height <= 150;