Search code examples
pythonpandasdataframecounterdata-analysis

How to groupby count across multiple columns in pandas


I have the following sample dataframe in Python pandas:

+---+------+------+------+
|   | col1 | col2 | col3 |
+---+------+------+------+
| 0 |   a  |   d  |   b  |
+---+------+------+------+
| 1 |   a  |   c  |   b  |
+---+------+------+------+
| 2 |   c  |   b  |   c  |
+---+------+------+------+
| 3 |   b  |   b  |   c  |
+---+------+------+------+
| 4 |   a  |   a  |   d  |
+---+------+------+------+

I would like to perform a count of all the 'a,' 'b,' 'c,' and 'd' values across columns 1-3 so that I would end up with a dataframe like this:

+---+--------+-------+
|   | letter | count |
+---+--------+-------+
| 0 |    a   |   4   |
+---+--------+-------+
| 1 |    b   |   5   |
+---+--------+-------+
| 2 |    c   |   4   |
+---+--------+-------+
| 3 |    d   |   2   |
+---+--------+-------+

One way I can do this is stack the columns on top of each other and THEN do a groupby count, but I feel like there has to be a better way. Can someone help me with this?


Solution

  • You can stack() the dataframe to put all columns into rows and then do value_counts:

    df.stack().value_counts()
    
    b    5
    c    4
    a    4
    d    2
    dtype: int64