Search code examples
sqlsql-servert-sqlleft-join

Concatenating multiple values from a field to appear in a single cell


There are 3 tables involved - PEOPLE (learners) and PEOPLE_UNITS (enrolments), and UNIT_INSTANCE_OCCURRENCES (course detail).

 select distinct

 p.PERSON_CODE "upn",
 p.FORENAME "firstName",
 p.SURNAME "lastName",
 pu.UNIT_INSTANCE_CODE "Classes"

 from PEOPLE p   
 inner join  PEOPLE_UNITS pu    on pu.person_code = p.person_code           
 inner join UNIT_INSTANCE_OCCURRENCES uio   on uio.uio_id = pu.uio_id

 where pu.CALOCC_CODE = 2324
 and pu.UNIT_TYPE = 'R'
 and pu.PROGRESS_STATUS in ('A','F')
 and uio.status = 'ACTIVE'
 and p.FES_STAFF_CODE is NULL
 and p.IS_ACTIVE = 'Y'
 and uio.OWNING_ORGANISATION = 'BSK'

 order by p.SURNAME, p.FORENAME

The above is a basic version of the query I'm using (additional fields which aren't relevant to the issue have been removed), and will return one row per learner when the learner has only a single course (class) enrolment. The issue comes when there are multiple enrolments per learner.

See below the first 10 rows of the output. You can see that for learners with upn 511812 and 472479 there is just a single row as they only have one class enrolment. The other learners all appear on multiple rows as they are enrolled on more than one class.

upn firstName lastname Classes
511812 Amal Abdelkerim BS4208LG
516708 Adil Abdellatief BS0071LG
516708 Adil Abdellatief BS0082LG
511994 Shabaaz Abdool-Carrim BS0081LG
511994 Shabaaz Abdool-Carrim BS0082LG
417436 Asma Abid BS0085PS
417436 Asma Abid BS0139LG
516236 Widad Abouchakra BS0071LG
516236 Widad Abouchakra BS4208LG
472479 Doaa Aboudallah BS0139LG

I require the data to appear as below, with the class enrolments concatenated with a semicolon separator.

upn firstName lastname Classes
511812 Amal Abdelkerim BS4208LG
516708 Adil Abdellatief BS0071LG;BS0082LG
511994 Shabaaz Abdool-Carrim BS0081LG;BS0082LG
417436 Asma Abid BS0085PS;BS0139LG
516236 Widad Abouchakra BS0071LG;BS4208LG
472479 Doaa Aboudallah BS0139LG

I have attempted using multiple subqueries which point to the people_units table, with each stipulating that the enrolment returned should not be equal to the previous one. However this resulted in the duplication of rows.
I also attempted a 'Stuff' command, but this retuned values in the classes field which don't exist as course codes in our system.


Solution

  • Just because you added some testdata:

    select upn, firstname, lastname, STRING_AGG(classes, ';') WITHIN GROUP (ORDER BY Classes)
    from (
        VALUES  (511812, N'Amal', N'Abdelkerim', N'BS4208LG')
        ,   (516708, N'Adil', N'Abdellatief', N'BS0071LG')
        ,   (516708, N'Adil', N'Abdellatief', N'BS0082LG')
        ,   (511994, N'Shabaaz', N'Abdool-Carrim', N'BS0081LG')
        ,   (511994, N'Shabaaz', N'Abdool-Carrim', N'BS0082LG')
        ,   (417436, N'Asma', N'Abid', N'BS0085PS')
        ,   (417436, N'Asma', N'Abid', N'BS0139LG')
        ,   (516236, N'Widad', N'Abouchakra', N'BS0071LG')
        ,   (516236, N'Widad', N'Abouchakra', N'BS4208LG')
        ,   (472479, N'Doaa', N'Aboudallah', N'BS0139LG')
    ) t (upn,firstName,lastname,Classes)
    group by upn, firstname, lastname
    ORDER BY lastname, firstname, upn
    

    STRING_AGG allows aggregating the strings into a CSV.

    If you're on older sql server you can use a bit more obtuse version:

    WITH DATA AS (
        SELECT  *
        FROM    (
        VALUES  (511812, N'Amal', N'Abdelkerim', N'BS4208LG')
        ,   (516708, N'Adil', N'Abdellatief', N'BS0071LG')
        ,   (516708, N'Adil', N'Abdellatief', N'BS0082LG')
        ,   (511994, N'Shabaaz', N'Abdool-Carrim', N'BS0081LG')
        ,   (511994, N'Shabaaz', N'Abdool-Carrim', N'BS0082LG')
        ,   (417436, N'Asma', N'Abid', N'BS0085PS')
        ,   (417436, N'Asma', N'Abid', N'BS0139LG')
        ,   (516236, N'Widad', N'Abouchakra', N'BS0071LG')
        ,   (516236, N'Widad', N'Abouchakra', N'BS4208LG')
        ,   (472479, N'Doaa', N'Aboudallah', N'BS0139LG')
        ) t (upn,firstName,lastname,Classes)
    )
    
    select upn, firstName, lastname
    , STUFF((SELECT ';' + classes from DATA d WHERE d.upn = dd.upn order by classes FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
    from DATA dd
    group by upn, firstname, lastname
    order by lastname, firstname, upn
    

    The XML Path thing creates an fake xml string that starts with ; and contains list of classes. To remove the starting ';', one usually uses STUFF function