I want to create a function that calculates the percentage of missing values in a data frame's column.
Here's my code:
Pourcentage_NA = function(df, col_Name){
res=100*( length(df[col_Name])-length(na.omit(df[col_Name])) ) / length(df[col_Name])
}
When I call it like this: x = Pourcentage_NA(Data, "A"), it shows that x = 0, although I know there are some missing values. Can anyone help me, please?
I tried changing the formula, but it keeps giving the same result.
You can do:
Pourcentage_NA <- function(df, col_Name) {
  # Percentage of NA entries in the column named col_Name
  100 * sum(is.na(df[col_Name])) / nrow(df)
}
Testing, we have:
dat <- data.frame(A = c(1, NA, 3, NA), B = c(NA, 2, 3, 4))
Pourcentage_NA(dat, "A")
#> [1] 50
Pourcentage_NA(dat, "B")
#> [1] 25
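As for why the formula in the question always gave 0: a likely explanation is that df[col_Name] with single brackets is still a data frame, so length() counts its columns (always 1 here) rather than its rows, both before and after na.omit(). A minimal sketch of that behaviour, reusing the test data; the variable names n_before, n_after, col_values, original_result and fixed_result are only illustrative:

```r
# Same test data as in the answer
dat <- data.frame(A = c(1, NA, 3, NA), B = c(NA, 2, 3, 4))

# Single brackets keep the data frame, so length() counts columns
n_before <- length(dat["A"])           # one column
n_after  <- length(na.omit(dat["A"]))  # still one column, just fewer rows
original_result <- 100 * (n_before - n_after) / n_before

# Double brackets extract the underlying vector, so length() counts values
col_values <- dat[["A"]]
fixed_result <- 100 * (length(col_values) - length(na.omit(col_values))) / length(col_values)
```

So the original formula can be kept if df[[col_Name]] is used in place of df[col_Name], although counting with is.na() is shorter.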
An alternative would be
Pourcentage_NA <- function(df, col_Name) 100 * colMeans(is.na(df))[col_Name]
Pourcentage_NA(dat, "A")
#> A
#> 50
Pourcentage_NA(dat, "B")
#> B
#> 25
Created on 2022-11-13 with reprex v2.0.2
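If a plain number without the column name attached is preferred, one possible variant takes the mean of is.na() on the extracted column. This is only a sketch: the function name pct_na and the input check are assumptions added for illustration, not part of the answer above.

```r
# Hypothetical helper, not from the original answer
pct_na <- function(df, col_name) {
  # Fail early if the input is not a data frame or the column is missing
  stopifnot(is.data.frame(df), col_name %in% names(df))
  # Double brackets return the column as a vector; the mean of a logical
  # vector is the share of TRUE values
  100 * mean(is.na(df[[col_name]]))
}

dat <- data.frame(A = c(1, NA, 3, NA), B = c(NA, 2, 3, 4))
res_a <- pct_na(dat, "A")
res_b <- pct_na(dat, "B")
```

Here res_a and res_b hold the percentages for columns A and B as unnamed numeric values.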