Search code examples
sql-serverunpivot

Converting single row into multiple rows based on values in columns


I have a table where each record represents a person and there are many columns used to indicate what events they attended:

CREATE TABLE EventAttendees
(
    Person VARCHAR(100),
    [Event A] VARCHAR(1),
    [Event B] VARCHAR(1),
    [Event C] VARCHAR(1)
)

INSERT INTO EventAttendees
SELECT 'John Smith','x',NULL,NULL
UNION
SELECT 'Jane Doe',NULL,'x','x'
UNION
SELECT 'Phil White','x',NULL,'x'
UNION
SELECT 'Sarah Jenkins','x','x','x'

Which looks like this for example:

SELECT * FROM Event Attendees

/---------------|---------|---------|---------\
| Person        | Event A | Event B | Event C |
|---------------|---------|---------|---------|
| John Smith    |    x    |   NULL  |   NULL  |
| Jane Doe      |   NULL  |    x    |    x    |
| Phil White    |    x    |   NULL  |    x    |
| Sarah Jenkins |    x    |    x    |    x    |
\---------------|---------|---------|---------/

I want to generate a list of who attended which events, so my desired output is:

/---------------|---------|
| Person        | Event   |
|---------------|---------|
| John Smith    | Event A |
| Jane Doe      | Event B |
| Jane Doe      | Event C |
| Phil White    | Event A |
| Phil White    | Event C |
| Sarah Jenkins | Event A |
| Sarah Jenkins | Event B |
| Sarah Jenkins | Event C |
\---------------|---------/

In reality I have many more than 3 events, but the above is for ease of explanation (This is not a homework question btw). As the Events might change in the future and I have no control over the data I am being passed, I really need a dynamic solution which can handle any number of possible event columns.

I'm assuming I can do something with UNPIVOT, but I just can't figure it out, or find a good example on SO or elsewhere to work from - can someone help?


Solution

  • Figured out the solution I was thinking of, but yes, it does require dynamic SQL to get the relevant column names to feed into the UNPIVOT:

    declare @sql varchar(max)
    set @sql = 
        'select Person, EventName
        from EventAttendees
        unpivot
        (
            Attended for EventName in (' + (select
                                            stuff((
                                                select ',' + QUOTENAME(c.[name])
                                                from sys.columns c
                                                join sys.objects o on c.object_id = o.object_id
                                                where o.[name] = 'EventAttendees'
                                                and c.column_id > 1
                                                order by c.[name]
                                                for xml path('')
                                            ),1,1,'') as colList) + ')
        ) unpiv
        where unpiv.Attended = ''x''
        order by Person, EventName'
    
    exec (@sql)
    

    In this example, I am making the assumption that the Event columns are from the second column in the table onwards, but obviously I could use some different logic within the subquery to identify the relevant columns if necessary.

    On my example data, this gives the desired result:

    /---------------------------\
    | Person        | EventName |
    |---------------|-----------|
    | Jane Doe      | Event B   |
    | Jane Doe      | Event C   |
    | John Smith    | Event A   |
    | Phil White    | Event A   |
    | Phil White    | Event C   |
    | Sarah Jenkins | Event A   |
    | Sarah Jenkins | Event B   |
    | Sarah Jenkins | Event C   |
    \---------------------------/
    

    I think I prefer this to using a cursor, although I haven't actually confirmed what performance difference (if any) there is between the two dynamic approaches.

    Thanks for everyone's help and suggestions on this question though, greatly appreciated as always!