Search code examples
oracle-databaseconcatenation

Capture values from multiple columns and apply trim and replace


I want to concatenate the strings from different columns in Oracle but some columns have value and others don’t have. What I want is that if the value of the last 3 characters of C1 matches to the first 3 characters of the value of the C2 then concatenate with dash(-) then drop the 2 second match/occurence or else concatenate with underscore(_) both 3 characters retain and so with the next columns then ignore the null ones.

Dataset:

Sno C1      C2      C3      C4
1   ABC-DEF DEF-ABC
2   DEF-ABC XYZ-DEF ABC-DEF
3   XYZ-DEF DEF-ABC ABC-DEF GHI-XYZ

Result:

SNo Concatenated_Data
1   ABC-DEF-ABC
2   DEF-ABC_XYZ-DEF_ABC-DEF
3   XYZ-DEF-ABC-DEF_GHI-XYZ

I saw this approach, it concatenates all the columns with value but the underscore does not capture if last 3 chars of C1 and first 3 chars of C2 do not match. Anyone can update the select statement?

SELECT Sno,TRIM(BOTH '-' FROM
     REGEXP_REPLACE(
       REGEXP_REPLACE(
         C1 || '-' || C2 || '-' || C3 || '-' || C4,
         '-{2,}',
         '-'
       ),
       '([^-_]{3})-\1',
       '\1'
     )
   ) AS Concatenated_Data
FROM   t1

Solution

  • Assuming that your columns will be filled with values from the left and have NULL values on the right (and not NULL values between two non-NULL values) then you can use a CASE expression to compare the last 3 characters of the previous term to the first 3 characters of the current term and concatenate based upon that:

    SELECT sno,
           c1
           || CASE
              WHEN c2 IS NULL
              THEN NULL
              WHEN SUBSTR(c1, -3) = SUBSTR(c2, 1, 3)
              THEN SUBSTR(c2, 4)
              ELSE '_' || c2
              END
           || CASE
              WHEN c3 IS NULL
              THEN NULL
              WHEN SUBSTR(c2, -3) = SUBSTR(c3, 1, 3)
              THEN SUBSTR(c3, 4)
              ELSE '_' || c3
              END
           || CASE
              WHEN c4 IS NULL
              THEN NULL
              WHEN SUBSTR(c3, -3) = SUBSTR(c4, 1, 3)
              THEN SUBSTR(c4, 4)
              ELSE '_' || c4
              END AS combined
    FROM   table_name;
    

    Which, for the sample data:

    CREATE TABLE table_name (Sno, C1, C2, C3, C4) AS
    SELECT 1, 'ABC-DEF', 'DEF-ABC', NULL,      NULL      FROM DUAL UNION ALL
    SELECT 2, 'DEF-ABC', 'XYZ-DEF', 'ABC-DEF', NULL      FROM DUAL UNION ALL
    SELECT 3, 'XYZ-DEF', 'DEF-ABC', 'ABC-DEF', 'GHI-XYZ' FROM DUAL;
    

    Outputs:

    SNO COMBINED
    1 ABC-DEF-ABC
    2 DEF-ABC_XYZ-DEF_ABC-DEF
    3 XYZ-DEF-ABC-DEF_GHI-XYZ

    You could also use regular expressions, which may be less to type but is probably going to be less efficient than simple string functions:

    SELECT sno,
           REGEXP_REPLACE(
              c1
              || CASE WHEN c2 IS NULL THEN NULL ELSE '_' || c2 END
              || CASE WHEN c3 IS NULL THEN NULL ELSE '_' || c3 END
              || CASE WHEN c4 IS NULL THEN NULL ELSE '_' || c4 END,
              '([^-_]{3})_\1',
              '\1'
           ) AS combined
    FROM   table_name;
    

    Which outputs the same.

    fiddle