I am using Azure SQL Database. I have a table in the following structure, to track which people are working in various posts at various times. To provide a simplified version:
CREATE TABLE PersonWorkload
(
WorkloadID INT PRIMARY KEY IDENTITY,
Post VARCHAR(30),
Period VARCHAR(12),
PersonID INT,
GroupID INT,
Notes VARCHAR(100)
)
And here's a sample of data
INSERT INTO PersonWorkload (Post, Period, PersonID, GroupID, Notes)
VALUES
('Group Leader', 'Q1', 1, 1, NULL),
('Group Leader', 'Q2', 1, 1, NULL),
('Group Leader', 'Q3', 1, 2, NULL),
('Group Leader', 'Q4', 1, 2, NULL),
('Security', 'Q1', 3, 3, NULL),
('Security', 'Q2', 3, 3, NULL),
('Security', 'Q3', 4, 4, NULL),
('Security', 'Q4', 4, 4, NULL),
('Team member', 'Q1', 7, 5, NULL),
('Team member', 'Q2', 7, 5, 'Leaving programme this qtr'),
('Team member', 'Q3', 13, 6, NULL),
('Team member', 'Q4', 13, 6, NULL),
('Team member', 'Q1', 8, 7, NULL),
('Team member', 'Q2', 8, 7, NULL),
('Team member', 'Q3', 8, 7, NULL),
('Team member', 'Q4', 8, 7, NULL),
('Team member','Q1',9,8,'Temp posting'),
('Team member', 'Q2', 10, 9, NULL),
('Team member', 'Q3', 10, 9, NULL),
('Team member', 'Q4', 10, 9, NULL),
('Comms','Q2',11,10,NULL),
('Comms','Q3',11,10,NULL),
('Comms','Q4',11,10,NULL)
The GroupID column is used because sometimes I need to treat a set of these posts together. A new person in a post will always trigger a new groupID, but sometimes a person can be in a post for a longer period, and that period can sometimes contain more than one groupID (so if a person is in a post for a year, it might be that Jan - June have one GroupID, and July to Dec a different GroupID).
Sometimes only one person will hold a post at a time, sometimes more than one, and sometimes no one will be in a post.
I need to pivot this information around in order to show a column for each period, along with who worked in each post. It may be that there are one or more rows for each post.
I've created a SQL PIVOT query to achieve this, and I've put the results into a temporary table.
Here's an example of what the temporary table looks like:
CREATE TABLE SampleData (
Post VARCHAR(100)
,GroupID INT
,[Q1] INT
,[Q2] INT
,[Q3] INT
,[Q4] INT
)
...and the statement to populate it:
INSERT INTO SampleData
SELECT *
FROM
(SELECT Post, Period, PersonID, GroupID
FROM PersonWorkload) t
PIVOT
(MAX(t.PersonID)
FOR t.Period IN ([Q1],[Q2],[Q3],[Q4])
) pvt
If I include the groupID I end up with data from my PIVOT query which looks like this:
Post | GroupID | Q1 | Q2 | Q3 | Q4 |
---|---|---|---|---|---|
Group Leader | 1 | 1 | 1 | NULL | NULL |
Group Leader | 2 | NULL | NULL | 1 | 1 |
Security | 3 | 3 | 3 | NULL | NULL |
Security | 4 | NULL | NULL | 4 | 4 |
Team member | 5 | 7 | 7 | NULL | NULL |
Team member | 6 | NULL | NULL | 13 | 13 |
Team member | 7 | 8 | 8 | 8 | 8 |
Team member | 8 | 9 | NULL | NULL | NULL |
Team member | 9 | NULL | 10 | 10 | 10 |
Comms | 10 | NULL | 11 | 11 | 11 |
This is close to what I want - except that I want to "collapse" the rows down, to look more like this - which shows that sometimes (top row) a person is in post all year, sometimes the post is filled by someone for the first couple of periods, then someone else comes in (2nd row), sometimes multiple people hold a post at the same time (Team members) and so on.
Post | Q1 | Q2 | Q3 | Q4 |
---|---|---|---|---|
Group Leader | 1 | 1 | 1 | 1 |
Security | 3 | 3 | 4 | 4 |
Team member | 7 | 7 | 13 | 13 |
Team member | 8 | 8 | 8 | 8 |
Team member | 9 | 10 | 10 | 10 |
Comms | NULL | 11 | 11 | 11 |
So, the obvious thing to do is to remove the GroupID, which I'm not needing for this report anyway. But, if I do that, the data "over-collapses" (sorry - don't know how better to express it) and we lose all except the lowest person ID.
Post | Q1 | Q2 | Q3 | Q4 |
---|---|---|---|---|
Comms | NULL | 11 | 11 | 11 |
Group Leader | 1 | 1 | 1 | 1 |
Security | 3 | 3 | 4 | 4 |
Team member | 9 | 10 | 13 | 13 |
Now I've lost, for example, person IDs 7 and 8 as team members, because they're not the max personID.
Is there a way to do this? I've been working on it for 2 days now, and can't get what I need.
This is a simplified version - in practice there are about 40 posts, and the report would need to have 15 time periods.
Any help gratefully received.
Thanks
As mentioned by @lptr in the comments:
You can add a row-numbering column to the pivot, to ensure that unique rows do not get merged.
SELECT *
FROM (
SELECT
Post,
Period,
PersonID,
dense_rank() over (partition by Post, Period order by GroupID, PersonID) as rn
FROM PersonWorkload
) t
PIVOT
(MAX(t.PersonID)
FOR t.Period IN ([Q1],[Q2],[Q3],[Q4])
) pvt;
ROW_NUMBER()
should also work here