
Why does min/max/sum(c(NA, 4, 5), na.rm = "xyz") work while mean() with same inputs doesn't?


I would like to understand why sum/min/max functions in R interpret a character string as TRUE when supplied to na.rm, while mean() does not.

My uneducated guess is that as.logical("xyz") returns NA, which is then supplied to na.rm and, for some strange reason, accepted as TRUE by sum/min/max but not by mean().

I would expect sum(c(NA, 4, 5), na.rm = "xyz") to fail with the same "argument is not interpretable as logical" error that mean() returns. I don't understand why that isn't the case.


Solution

  • As far as mean is concerned, it is quite straightforward. As @Rich Scriven mentions, if you type mean.default in the console, you will see this section of code:

    if (na.rm) 
       x <- x[!is.na(x)]
    

    The if (na.rm) condition is what raises the error:

    mean(1:10, na.rm = "abc") #gives
    

    Error in if (na.rm) x <- x[!is.na(x)] : argument is not interpretable as logical

    which is similar to doing

    if ("abc") "Hello"
    

    Error in if ("abc") "Hello" : argument is not interpretable as logical
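    To see which values if accepts as a condition, a small probe can help. This is only a sketch; try_if is a hypothetical helper name, not something from base R or from mean.default.

```r
# Hypothetical helper: evaluate `if (cond)` and report either the
# branch that ran or the error message that `if` raised.
try_if <- function(cond) {
  tryCatch(
    if (cond) "branch taken" else "branch skipped",
    error = function(e) paste("error:", conditionMessage(e))
  )
}

print(try_if(TRUE))    # a proper logical: "branch taken"
print(try_if("TRUE"))  # a string R can coerce to logical: "branch taken"
print(try_if("abc"))   # a string R cannot coerce: an error message is returned
print(try_if(NA))      # a missing value: an error message is returned
```

    So the failure in mean() comes from if itself: it accepts a string such as "TRUE", but refuses both an uninterpretable string and NA.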


    Now for sum, min, max and similar primitive functions, which are implemented in C. The source code of these functions is here. A parameter, Rboolean narm, is passed into the function.

    C treats boolean values differently from R.

    #include <stdio.h>
    #include <stdbool.h>
    
    int main(void)
    {
      /* A string literal is a non-null pointer, so it converts to true. */
      bool a = "abc";
      if (a)
        printf("Hello World\n");
      else
        printf("Not Hello World\n");
      return 0;
    }
    

    If you run the above C code, it prints "Hello World". Run the demo here. In C, any non-zero value converts to true when assigned to a boolean type, and a string literal is a non-null pointer, so it counts as TRUE. The same holds for numbers:

    sum(1:10, na.rm = 12)
    

    works as well.
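    Because sum() does not reject such values itself, one defensive option is to validate na.rm before calling it. A minimal sketch, assuming you want a hard error for anything other than a single TRUE or FALSE; strict_sum is a hypothetical wrapper, not part of base R.

```r
# Hypothetical wrapper: check na.rm explicitly, then delegate to sum().
strict_sum <- function(..., na.rm = FALSE) {
  # Accept only a single, non-missing logical value.
  if (!is.logical(na.rm) || length(na.rm) != 1L || is.na(na.rm)) {
    stop("`na.rm` must be a single TRUE or FALSE")
  }
  sum(..., na.rm = na.rm)
}

print(strict_sum(c(NA, 4, 5), na.rm = TRUE))  # 9

# Strings and numbers are now rejected instead of being silently accepted.
res <- tryCatch(strict_sum(c(NA, 4, 5), na.rm = "xyz"),
                error = function(e) conditionMessage(e))
print(res)
```

    The same check could be wrapped around min() and max() if you want them to behave consistently with mean().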

    PS - I am no expert in C and know only a little R. Working all of this out took a lot of time. Let me know if I have misinterpreted something or given any false information.