Search code examples
regexoverfloworacle12clistagg

LISTAGG function: Short string return ORA-01489: result of string concatenation is too long


I have a query to extract the email domain from a column that maintained with multiple email addresses. The column value is aaa@yahoo.com, bbb@hotmail.com, and the column size is VARCHAR2(3500 CHAR)

Below query used listagg function to extract the domains and prefixed with regular expression:

SELECT DISTINCT
LISTAGG( CHR(91)||'a-zA-Z0-9._%-'||CHR(93)||'+' ||REGEXP_SUBSTR(EMAIL_STRING,'@' || CHR(91) || '^,' 
|| CHR(93) ||'+', 1, LEVEL), ',')
WITHIN GROUP (ORDER BY LEVEL) AS  USER_EMAILS_DOMAIN
FROM EMAIL_PARAMETER WHERE NAME='UserEmails'
CONNECT BY REGEXP_SUBSTR(EMAIL_STRING,'@'||CHR(91)||'^,'||CHR(93)||'+', 1, LEVEL) IS NOT NULL;

Expected outcome: [a-zA-Z0-9.%-]+@yahoo.com,[a-zA-Z0-9.%-]+@hotmail.com

The string is within 4000 characters, however Oracle return ORA-01489: result of string concatenation is too long

I tried to use LISTAGG with ON OVERFLOW TRUNCATE option, it returns a long string with the second email, bbb@hotmail.com keeps on repeating.

I also try on XMLTAGG, also I'm getting a very long string with both emails are repeating.

select rtrim(xmlagg(xmlelement(e,CHR(91)||'a-zA-Z0-9._%-'||CHR(93)||'+' || EMAIL_STRING,',').extract('//text()') order by REGEXP_SUBSTR(EMAIL_STRING,'@' || CHR(91) || '^,' || CHR(93) ||'+', 1, LEVEL)).getclobval(),',') USER_EMAILS_DOMAIN
FROM EMAIL_PARAMETER WHERE NAME='UserEmails' 
CONNECT BY REGEXP_SUBSTR(EMAIL_STRING,'@'||CHR(91)||'^,'||CHR(93)||'+', 1, LEVEL) IS NOT NULL;

Solution

  • It looks as if DISTINCT doesn't do what you hoped it would.

    First select all distinct domains, then apply listagg to the result.

    Simplified:

    select listagg(domain, ',') within group (order by domain) final_result  --> 2. aggregation
    from 
         (select distinct regexp_substr(email_string, ...) as domain         --> 1. distinct domains
          from email_parameter
         )