I have a factor variable and I want to retrieve a count of each level. This is easy with the summary() function:
> h <- rnorm(100, 170, 10)
> hf <- cut(h, breaks=10)
> summary(hf)
(142,147] (147,153] (153,158] (158,163] (163,169] (169,174] (174,180] (180,185] (185,190]
5 3 7 20 11 23 12 11 6
(190,196]
2
But I want to include this in a knitr report, so I would prefer a more human-friendly way of displaying the data. The most obvious way is to transpose it, so that I get something like this:
(142,147] 5
(147,153] 3
(153,158] 7
(158,163] 20
(163,169] 11
(169,174] 23
(174,180] 12
(180,185] 11
(185,190] 6
(190,196] 2
The question is: what is the best way to achieve this?
(And by "the best" I mean "clean, efficient, compact and without any side effects")
Below I outline a few ways I have tried and why I am not perfectly happy with any of them.
> r <- as.data.frame(summary(hf))
> colnames(r) <- ""
> r
(142,147] 5
(147,153] 3
(153,158] 7
(158,163] 20
(163,169] 11
(169,174] 23
(174,180] 12
(180,185] 11
(185,190] 6
(190,196] 2
I don't like that I need a temporary variable to store the data frame, plus an extra line of code just to suppress the second column's header (which reads summary(hf) by default and is not very helpful). If I could hide the column name while converting the summary to a data.frame, or by using some printing function or argument, that would be perfect.
> as.data.frame(table(hf))
hf Freq
1 (142,147] 5
2 (147,153] 3
3 (153,158] 7
4 (158,163] 20
5 (163,169] 11
6 (169,174] 23
7 (174,180] 12
8 (180,185] 11
9 (185,190] 6
10 (190,196] 2
Here the headers are more readable, but now I have unneeded row names, which leads me to the next solution.
> write.table(as.data.frame(table(hf)), col.names=FALSE, row.names=FALSE)
"(142,147]" 5
"(147,153]" 3
"(153,158]" 7
"(158,163]" 20
"(163,169]" 11
"(169,174]" 23
"(174,180]" 12
"(180,185]" 11
"(185,190]" 6
"(190,196]" 2
This is fine as long as the factor level names all have the same length. When their lengths differ, the columns become misaligned:
> write.table(as.data.frame(table(h>170)), col.names=FALSE, row.names=FALSE)
"FALSE" 51
"TRUE" 49
If anyone has read this far, let me repeat my question:
What is the best way to get the number of occurrences of each factor level displayed in a "transposed" table, preferably without any side effects?
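On the misalignment seen with write.table() above: one possible direction is to pad every column to a common width with format() before writing it out, and to switch off quoting. This is only a minimal sketch; the set.seed() call and the variable name counts are assumptions added here so the example is reproducible and self-contained.

```r
# Assumption: a fixed seed, only so the example is reproducible
set.seed(1)
h <- rnorm(100, 170, 10)

# Count how many values fall on each side of 170
counts <- as.data.frame(table(h > 170))

# format() on a data frame pads each column to a common width,
# so "TRUE" gets a trailing space to match "FALSE";
# quote = FALSE drops the double quotes around the labels
write.table(format(counts), quote = FALSE,
            col.names = FALSE, row.names = FALSE)
```

This still goes through a data frame, so it does not remove the conversion step; it only addresses the alignment and the quoting.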
It seems like you simply want this:
setNames(as.data.frame(summary(hf)), "")
Of course you could also wrap your code in a function ...
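As a sketch of what such a wrapper might look like (the name level_counts is hypothetical, not something from base R or knitr):

```r
# Hypothetical helper: wraps the one-liner above so that no temporary
# variable or separate colnames() call is needed at the call site
level_counts <- function(f) {
  # summary() on a factor returns a named vector of counts;
  # as.data.frame() turns the names into row names, and
  # setNames(..., "") blanks out the single column header
  setNames(as.data.frame(summary(f)), "")
}

# Usage with data generated as in the question
h <- rnorm(100, 170, 10)
hf <- cut(h, breaks = 10)
level_counts(hf)
```

Because the function returns the data frame rather than printing it, the result can be printed directly in a report chunk or passed on to another formatting function.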