I am sure this is a super simple thing, but I cannot find a really quick and easy solution.
I have patient data with a lot of columns in a format like this:
patID disease category ...
1 1 A
2 0 B
3 1 C
4 1 B
How can I quickly produce a summary table that gives the number of observations for each value of every column/variable in the dataframe? The result should be something like this:
VARIABLE Number of rows
disease:1 3
disease:0 1
category:A 1
category:B 2
category:C 1
...
I know I can do this for a single variable by just using table(data$column). But how can I produce something similar for all columns in a dataframe?
Using tidyr and dplyr:
library(tidyr)
library(dplyr)
gather(data, variable, value, -patID) %>%
  count(variable, value)
(Thanks @Frank for reminding me about tally and count.)
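For anyone on a recent tidyr (1.0 or later), where gather is superseded, here is a rough sketch of the same idea with pivot_longer. This is not part of the original answer, just one way it might look. It assumes the data frame is called data and has the columns from the question; the as.character conversion is my addition, since pivot_longer refuses to stack numeric and character columns into a single value column.

```r
library(dplyr)
library(tidyr)

# Toy data mirroring the example in the question
data <- data.frame(
  patID    = 1:4,
  disease  = c(1, 0, 1, 1),
  category = c("A", "B", "C", "B")
)

summary_tbl <- data %>%
  # Convert every non-ID column to character so they can share one value column
  mutate(across(-patID, as.character)) %>%
  # Reshape to long format: one row per (patient, variable, value)
  pivot_longer(-patID, names_to = "variable", values_to = "value") %>%
  # Count rows for each variable/value combination
  count(variable, value)

summary_tbl
```

The result has one row per variable/value pair with the count in a column named n. If you want labels like disease:1, something like paste(variable, value, sep = ":") inside a mutate should do it.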