
Frequency of multiple boolean or non-boolean columns in R


I am new to R. I have a data frame (imported with read.csv) with more than 200 columns and more than 100 rows, holding the results of a survey. A column or a group of columns represents the answers to one question. I have two questions.

a) The columns named "Q1", "Q2", ... "Q9" contain booleans (yes/no). What is the command for creating a frequency table like the one below, i.e. the frequency of true/false for each column over all rows?

        q1     q2    q3    ...
true    5      99     11
false   95      1     89

b) The columns named "P1", "P2", ... "P9" contain values on a scale from 1 to 5 ("agree" ... "don't agree"). What is the command for creating a frequency table like the one below, i.e. the number of occurrences of 1, 2, ... 5 over all rows for each column?

        p1      p2     p3  ....
1        1        4     5
2        4       45     7
3       78       34     6
4        5       55     8
5        4       22    67  ....

Solution

  • Data:

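    # Small example data: three logical "q" columns and three 1-5 scale "p" columns
    df <- data.frame(q1=c(FALSE,TRUE,TRUE), q2=c(TRUE,FALSE,FALSE), q3=rep(TRUE,3), p1=c(1,2,1), p2=c(3,4,5), p3=c(4,4,2))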
    

    You can try:

    library(qdapTools)
    # Select the columns whose names start with "q", tabulate each one, then transpose
    t(mtabulate(df[grep('^q', names(df), value=TRUE)]))
    
    #      q1 q2 q3
    #FALSE  1  2  0
    #TRUE   2  1  3
    
    # Same approach for the columns whose names start with "p"
    t(mtabulate(df[grep('^p', names(df), value=TRUE)]))
    #  p1 p2 p3
    #1  2  0  0
    #2  1  0  1
    #3  0  1  0
    #4  0  1  2
    #5  0  1  0
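
If you would rather not depend on an extra package, the same tables can be sketched in base R. This is only one possible approach, not the method used above: it assumes the column-name prefixes "q" and "p" from the example data, and the names `q_cols`, `p_cols`, `q_freq` and `p_freq` are made up for illustration. Fixing the factor levels explicitly means each table gets a row for every possible answer, even one that nobody gave.

```r
# Example data, same shape as in the answer above
df <- data.frame(q1 = c(FALSE, TRUE, TRUE), q2 = c(TRUE, FALSE, FALSE), q3 = rep(TRUE, 3),
                 p1 = c(1, 2, 1), p2 = c(3, 4, 5), p3 = c(4, 4, 2))

# Pick columns by name prefix (assumption: names start with "q" or "p")
q_cols <- grep("^q", names(df), value = TRUE)
p_cols <- grep("^p", names(df), value = TRUE)

# Count each column separately; the fixed levels keep zero counts in the table
q_freq <- sapply(df[q_cols], function(x) table(factor(x, levels = c(TRUE, FALSE))))
p_freq <- sapply(df[p_cols], function(x) table(factor(x, levels = 1:5)))

q_freq  # rows TRUE/FALSE, one column per question
p_freq  # rows 1..5, one column per question
```

For the real survey, where the columns are named "Q1" ... "Q9" and "P1" ... "P9", the patterns would need to be "^Q" and "^P" instead.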