Search code examples
sqlpostgresqlpivot-table

Transform table to one-hot-encoding of single column value


I have a table with two columns:

+---------+--------+
| keyword | color  |
+---------+--------+
| foo     | red    |
| bar     | yellow |
| fobar   | red    |
| baz     | blue   |
| bazbaz  | green  |
+---------+--------+

I need to do some kind of one-hot encoding and transform table in PostgreSQL to:

+---------+-----+--------+-------+------+
| keyword | red | yellow | green | blue |
+---------+-----+--------+-------+------+
| foo     |   1 |      0 |     0 |    0 |
| bar     |   0 |      1 |     0 |    0 |
| fobar   |   1 |      0 |     0 |    0 |
| baz     |   0 |      0 |     0 |    1 |
| bazbaz  |   0 |      0 |     1 |    0 |
+---------+-----+--------+-------+------+

How do I make this transformation in SQL only?


Solution

  • If I correctly understand, you need conditional aggregation:

    select keyword,
    count(case when color = 'red' then 1 end) as red,
    count(case when color = 'yellow' then 1 end) as yellow
    -- another colors here
    from t
    group by keyword