
How to compute conditional Mode in R?


I have a large data set with 11 columns and 100,000 rows (for example) in which the values are 1, 2, 3, and 4, where 4 denotes a missing value. I need to compute the mode of each row. I am using the following data and function:

ac<-matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow=1, ncol=11)  

m<-as.matrix(apply(ac, 1, Mode))

If I use the above command, it gives me "4" as the mode, which is not what I need. I want the mode to omit 4 and return "3", because 4 is a missing value.

Thanks in advance.


Solution

  • R has a powerful mechanism for working with missing values. You can represent a missing value with NA, and many R functions support handling NA values.

    Create a small matrix with random numbers:

    set.seed(123)
    m <- matrix(sample(1:4, 12, replace=TRUE), ncol=3)
    m
         [,1] [,2] [,3]
    [1,]    2    4    3
    [2,]    4    1    2
    [3,]    2    3    4
    [4,]    4    4    2
    

    Since you represent missingness by the value 4, you can replace each occurrence by NA:

    m[m==4] <- NA
    m
    
         [,1] [,2] [,3]
    [1,]    2   NA    3
    [2,]   NA    1    2
    [3,]    2    3   NA
    [4,]   NA   NA    2
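
The same recoding can be applied to the character matrix from the question. This is a minimal sketch that assumes the missing-value code is stored as the string "4", as in the asker's data:

```r
# The asker's one-row character matrix, where "4" encodes a missing value
ac <- matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow = 1, ncol = 11)

# Recode every "4" as NA (compare against a string because the matrix is character)
ac[ac == "4"] <- NA

# Count how many entries are now missing
sum(is.na(ac))
# [1] 9
```

After this step only the two "3" entries remain as non-missing values.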
    

    To calculate, for example, the mean:

    mean(m[1, ], na.rm=TRUE)
    [1] 2.5
    
    apply(m, 1, mean, na.rm=TRUE)
    [1] 2.5 1.5 2.5 2.0
    

    To calculate the mode, you can use the function Mode in the package prettyR. Note that in this very small data set, only the 4th row has a unique modal value:

    library(prettyR)
    apply(m, 1, Mode, na.rm=TRUE)
    [1] ">1 mode" ">1 mode" ">1 mode" "2"
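
If installing prettyR is not an option, a similar result can probably be obtained with base R alone. The sketch below is not the prettyR implementation: mode_ignoring and its missing_code argument are hypothetical names introduced here for illustration. It drops NA and the missing-value code, tabulates what is left, and reuses the ">1 mode" label from the output above when there is a tie.

```r
# Hypothetical helper (not part of base R or prettyR):
# returns the most frequent value after dropping NA and a missing-value code.
mode_ignoring <- function(x, missing_code = "4") {
  x <- x[!is.na(x) & x != missing_code]       # drop NA and the missing-value code
  if (length(x) == 0) return(NA_character_)   # nothing left to tabulate
  counts <- table(x)                          # frequency of each remaining value
  modes <- names(counts)[counts == max(counts)]
  if (length(modes) > 1) return(">1 mode")    # tie: no unique modal value
  modes
}

# The asker's example row: "4" is the missing-value code
ac <- matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow = 1, ncol = 11)
apply(ac, 1, mode_ignoring)
# [1] "3"
```

Because the helper filters out the missing-value code itself, it can be applied to the original matrix without first recoding 4 as NA. It always returns a character value, so convert with as.numeric if numeric modes are needed.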