Python: Only 2 unique column names in dataframe, 3105 columns total. How to get average of row, grouped by unique column name

To keep it simple, my dataframe currently looks something like this:

Gene	Cell_A	Cell_B	Cell_B	Cell_B	Cell_A
Gene_A	0	4	35.5	4.5	3.5
Gene_B	1.3	52	3.4	2.4	0
Gene_C	2.3	3.3	32	0	2

There are 3105 columns in total, all named either Cell_A or Cell_B, and around 13k rows of genes (I think). What I want is the average value per gene (row), grouped by the unique column name. In the end I would have just 2 columns, Cell_A and Cell_B, each holding the per-gene average across all columns with that name.

I expect it has something to do with either agg or groupby, but I have no idea where to even start. If you can offer some guidance I would be very grateful!

Solution

You are right: you want to group the columns by name and take the mean of each group.

First, move the Gene column into the index so that only the numeric cell columns remain:

df = df.set_index(['Gene'])

Then group along the column axis, using the column names as keys, and take the mean:

df.groupby(by=df.columns, axis=1).mean()
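For reference, here is a minimal runnable sketch on the toy data from the question. It assumes a recent pandas: the axis=1 argument to groupby has been deprecated since pandas 2.1, so this version transposes the frame, groups on the index, and transposes back, which should give the same result. The variable name result is only illustrative.

```python
import pandas as pd

# Toy frame mirroring the question's layout, including duplicate column names.
df = pd.DataFrame(
    [
        ["Gene_A", 0.0, 4.0, 35.5, 4.5, 3.5],
        ["Gene_B", 1.3, 52.0, 3.4, 2.4, 0.0],
        ["Gene_C", 2.3, 3.3, 32.0, 0.0, 2.0],
    ],
    columns=["Gene", "Cell_A", "Cell_B", "Cell_B", "Cell_B", "Cell_A"],
)

# Keep the gene names as the index so only numeric columns remain.
df = df.set_index("Gene")

# Transpose so the duplicated column labels become the row index,
# group on that index, average each group, then transpose back.
result = df.T.groupby(level=0).mean().T

print(result)
```

For Gene_A, for example, the Cell_A value is the mean of 0 and 3.5, i.e. 1.75, and the Cell_B value is the mean of 4, 35.5 and 4.5.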