Kappam.light from the irr package in R: warning "In sqrt(varkappa): NaNs produced", kappa = NaN, z-value = NaN and p-value = NaN


I'm trying to calculate the inter-observer reliability in R for a scoring system using Light's kappa as provided by the irr package. It's a fully crossed design in which fifteen observers scored 20 subjects for something being present ("1") or not present ("0"). This is my data frame (imported from an Excel sheet):

library(irr)       
my.df #my dataframe

   a b c d e f g h i j k l m n o
1  0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
2  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3  0 0 0 0 0 0 0 0 0 0 1 0 0 1 0
4  0 1 1 0 0 0 1 0 0 0 0 0 0 0 0
5  0 1 0 0 1 1 0 0 0 1 1 0 0 1 0
6  0 1 0 0 1 1 0 0 0 0 0 1 1 0 0
7  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
8  0 0 1 0 0 0 0 0 0 1 0 0 0 0 0
9  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
11 0 1 1 1 0 1 0 0 0 1 0 0 0 0 1
12 0 1 0 0 0 1 0 1 0 1 0 0 1 0 0
13 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
14 0 1 0 1 0 1 1 0 0 1 1 1 1 1 0
15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
17 0 1 0 1 1 1 0 0 0 0 0 1 1 1 0
18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
20 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0

Next I try to calculate the kappa value and get the following response:

kappam.light(my.df) #calculating the kappa-value

Light's Kappa for m Raters

 Subjects = 20 
   Raters = 15 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

Warning messages:
1: In sqrt(varkappa) : NaNs produced
2: In sqrt(varkappa) : NaNs produced
3: In sqrt(varkappa) : NaNs produced
4: In sqrt(varkappa) : NaNs produced
5: In sqrt(varkappa) : NaNs produced
6: In sqrt(varkappa) : NaNs produced
7: In sqrt(varkappa) : NaNs produced
8: In sqrt(varkappa) : NaNs produced
9: In sqrt(varkappa) : NaNs produced
10: In sqrt(varkappa) : NaNs produced

I have already tried changing the class of all the variables to factor, character, numeric, and logical. Nothing works. I suspect it has something to do with the relatively low number of "1" scores. Any suggestions?

EDIT: I found a solution to the problem without having to exclude data. For a prevalence- and bias-adjusted kappa, PABAK can be used in two-rater problems; for multi-rater problems like this one you should use Randolph's kappa. It is based on Fleiss' kappa but uses free marginals (a fixed chance-agreement term), so the skewed prevalence of the ratings does not affect it. Ideal for the problem I had.

An online calculator can be found here. In R, the raters package can be used. I've compared the outcomes of the two methods and the results are virtually the same (a difference in the sixth decimal).
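
For reference, Randolph's free-marginal kappa can also be computed directly from its definition. Below is a minimal hand-rolled sketch (the helper name randolph.kappa is my own, not a function from the irr or raters packages); it assumes a fully crossed design in which every subject (row) is rated by every rater (column):

randolph.kappa <- function(ratings, categories = NULL) {
  ratings <- as.matrix(ratings)
  if (is.null(categories)) categories <- sort(unique(as.vector(ratings)))
  N <- nrow(ratings)       # subjects
  n <- ncol(ratings)       # raters per subject
  k <- length(categories)  # rating categories (here 2: 0 and 1)
  # n_ij: how many raters put subject i into category j
  counts <- t(apply(ratings, 1, function(r) table(factor(r, levels = categories))))
  # mean observed pairwise agreement across subjects, as in Fleiss' kappa
  p.obs <- sum(counts * (counts - 1)) / (N * n * (n - 1))
  # free-marginal chance agreement is a fixed 1/k
  p.exp <- 1 / k
  (p.obs - p.exp) / (1 - p.exp)
}

randolph.kappa(my.df)

Because the chance term is a fixed 1/k rather than being estimated from the heavily skewed marginals, the zero-variance columns do not make this statistic undefined.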


Solution

  • You are getting these NaN values and warnings because there is no variability in columns a and i.

    First, check the variability across the columns

    df <- my.df        # the data frame from the question
    apply(df, 2, sd)
            a         b         c         d         e         f         g         h         i         j         k         l         m         n         o 
    0.0000000 0.5104178 0.3663475 0.4103913 0.3663475 0.4893605 0.3077935 0.2236068 0.0000000 0.4701623 0.3663475 0.4103913 0.4103913 0.4103913 0.2236068 
    

    You can see that columns a and i have no variability: those raters scored every subject 0. Variability matters because kappa corrects the observed inter-rater agreement for chance agreement, and for a pair of raters who never vary that correction is undefined (a short illustration follows at the end of this answer).

    Therefore, you get output without warnings if you remove these two columns.

    df$a <- NULL     # drop the two zero-variance raters
    df$i <- NULL
    kappam.light(df)
     Light's Kappa for m Raters
    
     Subjects = 20 
       Raters = 13 
        Kappa = 0.19 
    
            z = 0 
      p-value = 1
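
    As a short illustration of why the full data frame produces NaN: Light's kappa is the mean of Cohen's kappa over all rater pairs (kappam.light() calls kappa2() for each pair, which is also where the sqrt(varkappa) warnings come from). For the pair of the two constant raters a and i, observed and chance agreement are both 1, so that pair's kappa is 0/0 = NaN, and the NaN propagates into the average. The sketch below is my own diagnostic snippet, run on the original data frame (my.df from the question) before any columns are dropped:

    library(irr)

    # Cohen's kappa for two raters who never vary: agreement and chance
    # agreement are both 1, so kappa = 0/0 = NaN
    kappa2(data.frame(a = rep(0, 20), i = rep(0, 20)))$value

    # scan every rater pair of the original data for undefined kappas
    pair.idx   <- combn(names(my.df), 2)
    pair.kappa <- apply(pair.idx, 2, function(p) kappa2(my.df[, p])$value)
    pair.idx[, is.nan(pair.kappa)]   # should single out the pair a / i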