r, database, function, parameters

Passing column names of a data frame to a function in R


I want to create a function that calculates the percentage of missing values in a data frame's column.

Here's my code:

Pourcentage_NA = function(df, col_Name){
  
  
  res=100*( length(df[col_Name])-length(na.omit(df[col_Name])) ) / length(df[col_Name])
}

When I call it like this: x = Pourcentage_NA(Data, "A"), I get x = 0, although I know the column has some missing values. Can anyone help me, please?

I tried changing the formula, but it keeps returning the same result.


Solution

  • You can do:

    Pourcentage_NA = function(df, col_Name){
      100 * sum(is.na(df[col_Name])) / nrow(df)
    }
    

    Testing, we have:

    dat <- data.frame(A = c(1, NA, 3, NA), B = c(NA, 2, 3, 4))
    
    Pourcentage_NA(dat, "A")
    #> [1] 50
    Pourcentage_NA(dat, "B")
    #> [1] 25
    

    An alternative would be:

    Pourcentage_NA <- function(df, col_Name) 100 * colMeans(is.na(df))[col_Name]
    
    Pourcentage_NA(dat, "A")
    #>  A 
    #> 50
    Pourcentage_NA(dat, "B")
    #>  B 
    #> 25
    

    Created on 2022-11-13 with reprex v2.0.2
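
As for why the original version always returned 0: a likely explanation is that `df[col_Name]` with single brackets gives back a one-column data frame, and `length()` of a data frame counts its columns, not its rows. `na.omit()` drops rows, so the column count stays at 1 and the formula computes 100 * (1 - 1) / 1. The sketch below illustrates this on a small test data frame; `pct_na` is a hypothetical name for a variant that uses `[[ ]]` to extract the column as a vector, not something from the question or the answer above.

```r
# Small test data frame with known missing values
dat <- data.frame(A = c(1, NA, 3, NA), B = c(NA, 2, 3, 4))

# Single brackets with a column name return a one-column data frame,
# and length() of a data frame is its number of columns.
length(dat["A"])             # 1
length(na.omit(dat["A"]))    # still 1: na.omit() drops rows, not columns

# Double brackets extract the column as a plain vector.
length(dat[["A"]])           # 4
length(na.omit(dat[["A"]]))  # 2

# Hypothetical variant built on [[ ]]: the mean of a logical vector
# is the proportion of TRUE values, here the share of NAs.
pct_na <- function(df, col_name) {
  stopifnot(is.data.frame(df), col_name %in% names(df))
  100 * mean(is.na(df[[col_name]]))
}

pct_na(dat, "A")  # 50
pct_na(dat, "B")  # 25
```

Unlike the `colMeans()` alternative, this variant returns an unnamed number, and the `stopifnot()` check fails early if the column name is misspelled.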