Search code examples
sqloracle-databaseclob

Clob Issues- Splitting a clob column to multiple rows


I have a table (description of table):

Name Type
KEYVALUE VARCHAR2(100)
TEXT CLOB

Example

Keyvalue Text
101 Customer Input 05/15/2023 07:20:20 My name is ABX +++ Private Notes What is you name+++Customer Input 04/30/2023 19:40:58 I have issue related to water purifier purchased on Jan 23 +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 Requesting to send a technicial, we could not bring them up due to the same issue that was looked into in ticket 20092. We dont know if this is the same issue as the previous ticket, but need to know the reason Language Preference: English
102 Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket+++Customer Input 04/30/2023 19:40:58 Plesae replace the item as this is a faulty one +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes or maintenance performed? - Language Preference: English

I am basically splitting the entire phase to multiple rows by the word "customer Input". Something like below:

SELECT distinct keyvalue, level pos, trim(regexp_substr(text, 'Customer Input[^+++]*', 1, level))  x
  FROM 
  (
    SELECT 101 as keyvalue,'Customer Input 05/15/2023 07:20:20 My name is ABX +++ Private Notes What is you name+++Customer Input 04/30/2023 19:40:58 I have issue related to water purifier purchased on Jan 23 
    +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 Requesting to send a technicial, we could not bring them up due to the same issue that was looked into in ticket 20092. We dont know if this is the same issue as the previous ticket, but need to know the reason Language Preference: English| '
    as text from dual
    union all 
    SELECT 102 as keyvalue,' Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket+++Customer Input 04/30/2023 19:40:58 Plesae replace the item as this is a faulty one 
    +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes 
    or maintenance performed? - Language Preference: English| '
    as text from dual  

  ) t
CONNECT BY instr(text, 'Customer Input', 1, level - 1) > 0
order by keyvalue;
keyvalue pos text
101 1 Customer Input 05/15/2023 07:20:20 My name is ABX
101 2 Customer Input 04/30/2023 19:40:58 I have issue related to water purifier purchased on Jan 23
101 3 Customer Input 04/30/2023 Requesting to send a technicial, we could not bring them up due to the same issue that was looked into in ticket 20092. We dont know if this is the same issue as the previous ticket, but need to know the reason Language Preference: English
101 4
102 1 Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket
102 2 Customer Input 04/30/2023 19:40:58 Please replace the item as this is a faulty one
102 3 Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes or maintenance performed? - Language Preference: English
102 4

This is working fine since the text column is of character datatype.

But when I am running this below query (on the actual column which is of clob datatype)

SELECT distinct keyvalue, level pos, trim(regexp_substr(customer_input_info, 'Customer Input[^+++]*', 1, level)) str
  FROM (select 101 as keyvalue,to_clob('Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket+++Customer Input 04/30/2023 19:40:58 Plesae replace the item as this is a faulty one 
    +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes 
    or maintenance performed? - Language Preference: English| ') as customer_input_info from dual) t
CONNECT BY instr(customer_input_info, 'Customer Input', 1, level - 1) > 0
order by 1

I am getting below error

ORA-00932: inconsistent datatypes: expected - got CLOB 00932. 00000 - "inconsistent datatypes: expected %s got %s" *Cause:
*Action: Error at Line: 44 Column: 38.

I can't make changes to inner sql query as the source table is of clob data type. What changes should I make to outer query.


Solution

  • The code you posted will get that error if the source text is a CLOB, whatever the length. The problem isn't the length itself, it's that each split row value is also a CLOB, and you can't use distinct with CLOBs.

    The use of distinct is often a sign that there's a deeper problem that's just covering up. Without it you do get duplicates, but that's a well-known issue with connect-by queries against multiple source rows, and will get progressively worse with more rows.

    You need to limit the connect-by to the same source row, which is simple assuming keyvalue is unique; but you also need to introduce a non-deterministic function call to prevent it ballooning the results, for example:

    CONNECT BY instr(text, 'Customer Input', 1, level - 1) > 0
    AND keyvalue = PRIOR keyvalue
    AND PRIOR dbms_random.value IS NOT NULL
    

    fiddle

    You might finder it easier to understand and maintain if you switch to using recursive subquery factoring instead of a hierarchical query:

    WITH r (keyvalue, text, pos, x) as (
      SELECT keyvalue, text, 1, trim(regexp_substr(text, 'Customer Input[^+++]*', 1, 1))
      FROM t
      UNION ALL
      SELECT keyvalue, text, pos + 1, trim(regexp_substr(text, 'Customer Input[^+++]*', 1, pos + 1))
      FROM r
      WHERE instr(text, 'Customer Input', 1, pos) > 0
    )
    SELECT keyvalue, pos, x
    FROM r
    order by keyvalue, pos;
    
    KEYVALUE POS X
    101 1 Customer Input 05/15/2023 07:20:20 My name is ABX
    101 2 Customer Input 04/30/2023 19:40:58 I have issue related to water purifier purchased on Jan 23
    101 3 Customer Input 04/30/2023 Requesting to send a technicial, we could not bring them up due to the same issue that was looked into in ticket 20092. We dont know if this is the same issue as the previous ticket, but need to know the reason Language Preference: English|
    101 4
    102 1 Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket
    102 2 Customer Input 04/30/2023 19:40:58 Plesae replace the item as this is a faulty one
    102 3 Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes
        or maintenance performed? - Language Preference: English|
    102 4

    fiddle

    I've left it with the same stop condition, which generates a final null entry. You may want to revisit that, for either approach.