How to use prop.table for multiple rows


I have a very large data set (around 45,000 observations) and would like to determine the proportion of errors ("1") and correct actions ("0") that individuals produce for various categories. Each row is a single observation for one individual, and an individual may appear in more than one row.

The data set looks like this:

 Type 1     Type 2   Individual
   1          0          T1
   0          0          T4
   0          1          T5
   0          0          T1
   1          1          T1
   0          1          T1
   1          1          T3

I want to use the prop.table function, but I can only seem to get the proportions of errors and correct actions for the entire dataset OR for each individual relative to the entire dataset. So far, I have tried:

prop.table(table(SourcePop$error))
prop.table(table(SourcePop$error, SourcePop$individual))

I want to find these proportions for each individual relative only to themselves (i.e. looking at the proportions for T1 only, T2 only, etc.). I am very new to R, so any help is greatly appreciated. Thanks.


Solution

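  • Assuming that the column error refers to the columns Type 1 or Type 2 of the example data, the proportions for each individual can be computed by specifying the margin argument of prop.table. With margin = 2, each column of the table (here, each individual) is scaled so that it sums to 1.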

    SourcePop <- read.table(text=" 'Type 1'     'Type 2'   Individual
       1          0          T1
       0          0          T4
       0          1          T5
       0          0          T1
       1          1          T1
       0          1          T1
       1          1          T3", header = TRUE)
    
    prop.table(table(SourcePop$Type.1))
    #> 
    #>         0         1 
    #> 0.5714286 0.4285714
    
    prop.table(table(SourcePop$Type.1, SourcePop$Individual), margin = 2)
    #>    
    #>      T1  T3  T4  T5
    #>   0 0.5 0.0 1.0 1.0
    #>   1 0.5 1.0 0.0 0.0
    prop.table(table(SourcePop$Type.2, SourcePop$Individual), margin = 2)
    #>    
    #>      T1  T3  T4  T5
    #>   0 0.5 0.0 1.0 0.0
    #>   1 0.5 1.0 0.0 1.0
    

    Created on 2020-03-28 by the reprex package (v0.3.0)
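
    As a rough sketch of why the original attempt gave proportions relative to the whole dataset, the snippet below compares prop.table with and without margin on the same example data. It also shows a possible shortcut that is not part of the answer above: because the columns are coded 0/1, the mean of a column within each individual equals that individual's proportion of errors. The object names overall, per_individual, error_rate_type1 and error_rate_both are made up for illustration, and the data frame is rebuilt by hand on the assumption that the columns are named Type.1, Type.2 and Individual as in the reprex.

```r
# Example data from the question, rebuilt by hand (column names assumed
# to match what read.table produced above: Type.1, Type.2, Individual)
SourcePop <- data.frame(
  Type.1     = c(1, 0, 0, 0, 1, 0, 1),
  Type.2     = c(0, 0, 1, 0, 1, 1, 1),
  Individual = c("T1", "T4", "T5", "T1", "T1", "T1", "T3")
)

tab <- table(SourcePop$Type.1, SourcePop$Individual)

# No margin: every cell is divided by the grand total,
# so the whole table sums to 1 (relative to the entire dataset)
overall <- prop.table(tab)

# margin = 2: every cell is divided by its column total,
# so each individual's column sums to 1 (relative to themselves)
per_individual <- prop.table(tab, margin = 2)

# Shortcut for 0/1 data: the group mean is the proportion of 1s (errors)
error_rate_type1 <- tapply(SourcePop$Type.1, SourcePop$Individual, mean)
# T1 = 0.5, T3 = 1, T4 = 0, T5 = 0

# Same idea for both type columns at once, one row per individual
error_rate_both <- aggregate(cbind(Type.1, Type.2) ~ Individual,
                             data = SourcePop, FUN = mean)
print(error_rate_both)
```

    With around 45,000 rows and many individuals, the one-row-per-individual layout from aggregate may be easier to read than a wide table, but it only reports the share of errors; the share of correct actions is 1 minus that value.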