I have a large data set with 11 columns and 100000 rows (for example) containing the values 1, 2, 3 and 4, where 4 denotes a missing value. I need to compute the mode of each row. I am using the following data and function:
ac<-matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow=1, ncol=11)
m<-as.matrix(apply(ac, 1, Mode))
If I use the above command, it gives me "4" as the mode, which is not what I need. I want the mode to ignore 4 and return "3", because 4 is a missing value.
Thanks in advance.
R has a powerful mechanism to work with missing values. You can represent a missing value with NA, and many R functions have support for dealing with NA values.
Create a small matrix with random numbers:
set.seed(123)
m <- matrix(sample(1:4, 12, replace=TRUE), ncol=3)
m
     [,1] [,2] [,3]
[1,]    2    4    3
[2,]    4    1    2
[3,]    2    3    4
[4,]    4    4    2
Since you represent missingness by the value 4, you can replace each occurrence with NA:
m[m==4] <- NA
m
     [,1] [,2] [,3]
[1,]    2   NA    3
[2,]   NA    1    2
[3,]    2    3   NA
[4,]   NA   NA    2
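The matrix in the question holds character strings rather than numbers, so the same replacement should compare against the string "4". A minimal sketch using the question's data; `ac_num` is a name introduced here for illustration only:

```r
# Data from the question: a 1 x 11 character matrix where "4" means missing
ac <- matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow = 1, ncol = 11)

# Mark the missing-value code as NA (string comparison, since ac is character)
ac[ac == "4"] <- NA

# Optionally convert to numeric while keeping the matrix shape
ac_num <- matrix(as.numeric(ac), nrow = nrow(ac))

# Count the cells that are now missing
sum(is.na(ac))
```

Converting to numeric is only needed if you later want functions such as mean; counting values for a mode works on either type.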
To calculate, for example, the mean:
mean(m[1, ], na.rm=TRUE)
[1] 2.5
apply(m, 1, mean, na.rm=TRUE)
[1] 2.5 1.5 2.5 2.0
To calculate the mode, you can use the function Mode in package prettyR. (Note that in this very small set of data, only the 4th row has a unique modal value.)
library(prettyR)
apply(m, 1, Mode, na.rm=TRUE)
[1] ">1 mode" ">1 mode" ">1 mode" "2"
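If you would rather not depend on an extra package, a mode can also be sketched in base R with table. The helper below, `mode_na`, is a hypothetical name and not part of prettyR. It drops NA values, returns every value tied for the highest count, and returns the values as character strings because they come from the names of the table:

```r
# Hypothetical helper: most frequent non-missing value(s) of x
mode_na <- function(x) {
  x <- x[!is.na(x)]                      # drop missing values
  if (length(x) == 0) return(NA)         # nothing left to count
  counts <- table(x)                     # frequency of each distinct value
  names(counts)[counts == max(counts)]   # all values tied for the top count
}

# Data from the question, with the missing-value code "4" replaced by NA
ac <- matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow = 1, ncol = 11)
ac[ac == "4"] <- NA

# Row-wise mode, ignoring the missing values
apply(ac, 1, mode_na)
```

When rows differ in the number of tied modes, apply returns a list instead of a vector, so you may prefer to keep only the first mode or collapse ties into a single string, depending on how you want ties reported.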