Search code examples
sqlsql-serverpivotpivot-table

SQL Server transposing rows to columns


I need to transpose rows to columns to display data in different dimensions or meet table format requirements using SQL server.

Starting from this query and this return

SELECT 
  _p,
  _t,
  _x,
 COUNT (*) _q 
FROM
    [dbo].[subset_data] 
WHERE
    CONVERT ( VARCHAR ( 10 ), data, 120 ) = CONVERT ( VARCHAR ( 10 ), GETDATE( ), 120 ) 
    GROUP BY
    _p,
    _t,
    _x;


+--------+--------------+----------+--------+
| _p     | _t           |   _x     |   _q   |
+--------+--------------+----------+--------+
| 001    | 06:00-12:00  |   15     |    1   |
| 001    | 12:00-18:00  |   15     |    1   |
| 008    | 06:00-12:00  |    5     |    2   |
| 009    | 12:00-18:00  |    6     |    1   |
| 010    | 06:00-12:00  |   12     |    1   |
+--------+--------------+----------+--------+

I have to get this return

+--------+--------------+-------------+-------------+-------------+------------+
| _p     | 00:00-06:00  | 06:00-12:00 | 12:00-18:00 | 18:00-23:59 |    _x      |
+--------+--------------+-------------+-------------+-------------+------------+
| 001    |      0       |      1      |      1      |      0      |    15      |
| 008    |      0       |      2      |      0      |      0      |     5      |
| 009    |      0       |      0      |      1      |      0      |     6      |
| 010    |      0       |      1      |      0      |      0      |    12      |
+--------+--------------+-------------+-------------+-------------+------------+

Where the row values of column _t are to be converted to columns and for each value of the _p column I have to retrieve the value of the row from _q column.

Can you help me?

-- INIT database
CREATE TABLE subset_data (
   data DATETIME,
  _p VARCHAR(100),
  _t VARCHAR(100),
  _x VARCHAR(100),
);

INSERT INTO subset_data(data, _p, _t, _x) VALUES (GETDATE(), '001', '06:00-12:00', '15');
INSERT INTO subset_data(data, _p, _t, _x) VALUES (GETDATE(), '001', '12:00-18:00', '15' );
INSERT INTO subset_data(data, _p, _t, _x) VALUES (GETDATE(), '008', '06:00-12:00', '5' );
INSERT INTO subset_data(data, _p, _t, _x) VALUES (GETDATE(), '008', '06:00-12:00', '5' );
INSERT INTO subset_data(data, _p, _t, _x) VALUES (GETDATE(), '009', '12:00-18:00', '6');
INSERT INTO subset_data(data, _p, _t, _x) VALUES (GETDATE(), '010', '06:00-12:00', '12');

SQLFiddle

My attempt to use PIVOT is as follows:

select  
  _p, _t, _x
from
(
  select _p
  from [dbo].[subset_data] 
) d
pivot
(
  _p
  for _p in (_p, _t, _x)
) piv;

Error

> Msg 156, Level 15, State 1, Server IT000001861251, Procedure , Line 0
Incorrect syntax near the keyword 'for'.
> [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near the keyword 'for'. (156)

Solution

  • The attempt you've made has a lot of flaws:

    1. Your derived table d only exposes the column _p, so _t, and _x aren't accessible.
    2. Your PIVOT lacks an aggregate function.
    3. You are trying to check the value of _p, but it _t you want to check
    4. You are checking if the column (_p) has the value '_p', '_t', or '_x' but those values never appear in that column, nor the column you actually want to pivot on (_t).
    5. You have (had) a trailing comma in your IN clause.

    The suggested duplicate, Efficiently convert rows to columns in sql server, shows you how to properly use PIVOT, which is in the format:

    SELECT <Columns List from Pivoted derived table>
    FROM (SELECT <Columns List>
          FROM <Table Name> --Your Table) <Alias>
    PIVOT (<Aggregate Expression>
           FOR <Column to PIVOT on> IN (<Delimited list of Values to PIVOT on>)) <Pivot Alias>;
    

    For your data, this results in the following:

    SELECT _p,
           [00:00-06:00],
           [06:00-12:00],
           [12:00-18:00],
           [18:00-23:59],
           _x
    FROM (SELECT _p,
                 _t,
                 _x
            FROM [dbo].[subset_data]) d
    PIVOT (COUNT(_t)
           FOR _t IN ([00:00-06:00],[06:00-12:00],[12:00-18:00],[18:00-23:59])) piv
    ORDER BY _p;
    

    Personally, as always, however, I suggest conditional aggregation; the syntax is less "clunky", it's less restrictive, and it's (more) transferable to other dialects:

    SELECT _p,
           COUNT(CASE _t WHEN '00:00-06:00' THEN 1 END) AS [00:00-06:00],
           COUNT(CASE _t WHEN '06:00-12:00' THEN 1 END) AS [06:00-12:00],
           COUNT(CASE _t WHEN '12:00-18:00' THEN 1 END) AS [12:00-18:00],
           COUNT(CASE _t WHEN '18:00-23:59' THEN 1 END) AS [18:00-23:59],
           _x
    FROM dbo.subset_data sd
    GROUP BY _p,
             _x
    ORDER BY _p;