So I have 169 columns which have been encoded as 1 for yes and 0 for no. Now I need to aggregate the 2 million rows by taking the mean of each column, and then round the results to the nearest int. How could I do that?
The image is just showing that the values per column are either 0 or 1
If data is your dataframe, you can get the mean of all the columns as integers simply with:
data.mean().astype(int) # Truncates mean to integer, e.g. 1.95 = 1
or, as of pandas version 0.17.0:
data.mean().round(0) # Rounds mean to nearest integer, e.g. 1.95 = 2 and 1.05 = 1
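To make the difference concrete for 0/1 columns, here is a minimal sketch on a tiny made-up frame (the column names a, b, c and their values are hypothetical stand-ins for the real 169 columns). It also chains astype(int) after rounding, since round on its own leaves the result as floats:

```python
import pandas as pd

# Hypothetical stand-in for the real 2-million-row frame of 0/1 flags
data = pd.DataFrame({
    "a": [1, 1, 1, 0],  # mean 0.75
    "b": [0, 0, 1, 0],  # mean 0.25
    "c": [1, 1, 1, 1],  # mean 1.0
})

means = data.mean()  # one float per column

# Truncation: any mean below 1.0 becomes 0
truncated = means.astype(int)

# Round to the nearest integer first, then cast the floats to int
rounded = means.round().astype(int)

print(truncated.to_dict())
print(rounded.to_dict())
```

With means that always fall between 0 and 1, truncation gives 0 unless every row is 1, so rounding is probably what you want here. One caveat: as far as I know, round follows NumPy's round-half-to-even rule, so a mean of exactly 0.5 comes out as 0 rather than 1.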