r, pretty-print, t-test

How do I program tests in R so that they print nicely?


Statistical tests in R return lists, but printing one of these lists produces a specially formatted, user-friendly summary rather than the usual list output. To see what I mean, consider an example using the t.test function in the stats package.

#Run a T-test on some example data
X <- c(30, 32, 40, 28, 29, 35, 30, 34, 31, 39);
Y <- c(19, 20, 44, 45, 8, 29, 26, 59, 35, 50);
TEST <- stats::t.test(X,Y);

#Show structure of the TEST object
str(TEST);
List of 9
 $ statistic  : Named num -0.134
  ..- attr(*, "names")= chr "t"
 $ parameter  : Named num 10.2
  ..- attr(*, "names")= chr "df"
 $ p.value    : num 0.896
 $ conf.int   : num [1:2] -12.3 10.9
  ..- attr(*, "conf.level")= num 0.95
 $ estimate   : Named num [1:2] 32.8 33.5
  ..- attr(*, "names")= chr [1:2] "mean of x" "mean of y"
 $ null.value : Named num 0
  ..- attr(*, "names")= chr "difference in means"
 $ alternative: chr "two.sided"
 $ method     : chr "Welch Two Sample t-test"
 $ data.name  : chr "X and Y"
 - attr(*, "class")= chr "htest"

This object is a list with nine elements, some of which are named via attributes. However, when I print the TEST object, the returned information is structured in a different way than the standard printing of a list.

#Print the TEST object
TEST;

        Welch Two Sample t-test

data:  X and Y
t = -0.13444, df = 10.204, p-value = 0.8957
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.27046  10.87046
sample estimates:
mean of x mean of y 
     32.8      33.5 

As you can see, this printed output is much more user-friendly than the standard printing for a list. I would like to be able to program statistical tests in R which generate a list of outputs similar to the above, but which print in this user-friendly way.


My Questions: Why does R print the output of the list TEST in this special way? If I create a list of outputs of a statistical test (e.g., like the above), how can I set the object to print in this way?


Solution

  • This answer is put together from helpful comments and answers by other users, but I wanted to give a more detailed answer here to make things explicit for users who are not already familiar with these issues. The object created by the t.test function has class htest, and R's S3 dispatch system prints objects of this class with the print.htest method, which lives in the stats package namespace (not in the global environment). That method extracts information from the list and prints it in the user-friendly layout shown in the question.
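
    To see where this printing behaviour comes from, here is a small sketch that inspects the class of the t-test object from the question, looks up the print method with getS3method, and then strips the class with unclass to show the fallback to ordinary list printing. It assumes only base R and the stats package; the names PRINT.METHOD and PLAIN are illustrative.

    ```r
    #Recreate the t-test object from the question
    X <- c(30, 32, 40, 28, 29, 35, 30, 34, 31, 39)
    Y <- c(19, 20, 44, 45, 8, 29, 26, 59, 35, 50)
    TEST <- stats::t.test(X, Y)

    #The class attribute is what print() dispatches on
    class(TEST)

    #Look up the method registered for the generic print() and class htest
    PRINT.METHOD <- getS3method("print", "htest")
    is.function(PRINT.METHOD)

    #Without the class, the same data prints as a plain list
    PLAIN <- unclass(TEST)
    class(PLAIN)
    print(PLAIN)
    ```

    The data in the list is unchanged by unclass; only the class attribute, and therefore the print method that gets chosen, differs.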

    If you want to replicate this type of printing for a new statistical test that you are programming yourself, then you will need to structure your new test so that it returns an htest object, that is, a list with the required elements and the class htest. Here is an example from another answer where a hypothesis test set out in Tarone (1979) is programmed as an htest object:

    Tarone.test <- function(N, M) {
    
        #Check validity of inputs
        if(any(M > N)) { stop("Observed count value exceeds binomial trials"); }
    
        #Set hypothesis test objects
        method      <- "Tarone's Z test";
        alternative <- "greater";
        null.value  <- 0;
        attr(null.value, "names") <- "dispersion parameter";
        data.name   <- paste0(deparse(substitute(M)), " successes from ", 
                              deparse(substitute(N)), " counts");
    
        #Calculate test statistics
        estimate    <- sum(M)/sum(N);
        attr(estimate, "names") <- "proportion parameter";
    
        S           <- sum((M - N*estimate)^2/(estimate*(1 - estimate)));
        statistic   <- (S - sum(N))/sqrt(2*sum(N*(N-1))); 
        attr(statistic, "names") <- "z";
    
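        #Note: this p-value is two-sided, whereas the alternative set above is "greater";
        #a one-sided version would be pnorm(statistic, lower.tail = FALSE)
        p.value     <- 2*pnorm(-abs(statistic), 0, 1);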
        attr(p.value, "names") <- NULL;
    
        #Create htest object
        TEST        <- list(statistic = statistic, p.value = p.value, estimate = estimate, 
                            null.value = null.value, alternative = alternative, 
                            method = method, data.name = data.name);
        class(TEST) <- "htest";
    
        TEST; }
    

    In this example, the function calculates all the required elements of the htest object and then creates this object as a list with that class. It is important to include the command class(TEST) <- "htest" in the code, so that the object created is not just a regular list. Inclusion of that command will ensure that the output object is of the proper class, and so it will print in a user-friendly way. To see this, we can generate some data and apply the test:
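
    The effect of the class assignment can be checked in isolation. The sketch below builds a small list by hand and compares it before and after the class is set. The element values and the method name "Illustrative Z test" are made up for illustration, and the claim that a list with only these four elements is enough for print.htest is an assumption about the method's behaviour rather than a documented minimum.

    ```r
    #A small list with a few of the elements that print.htest reads
    #(all values here are made up for illustration)
    RESULT <- list(statistic = c(z = 1.5),
                   p.value   = 2*pnorm(-1.5),
                   method    = "Illustrative Z test",
                   data.name = "made-up data")

    #Without the class, this is an ordinary list
    inherits(RESULT, "htest")

    #With the class, print() dispatches to the htest print method
    class(RESULT) <- "htest"
    inherits(RESULT, "htest")
    print(RESULT)
    ```

    Elements that are absent from the list (here the confidence interval, estimate and alternative) are left out of the printed summary.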

    #Generate example data
    N <- c(30, 32, 40, 28, 29, 35, 30, 34, 31, 39);
    M <- c( 9, 10, 22, 15,  8, 19, 16, 19, 15, 10);
    
    #Apply Tarone's test to the example data
    TEST <- Tarone.test(N, M);
    TEST;
    
            Tarone's Z test
    
    data:  M successes from N counts
    z = 2.5988, p-value = 0.009355
    alternative hypothesis: true dispersion parameter is greater than 0
    sample estimates:
    proportion parameter 
               0.4359756
    

    Here we see that our newly created hypothesis-testing function gives output with a user-friendly structure similar to that of t.test. In this example we have given different names to the testing method and the elements of the test, and these appear in the descriptive output when printed.
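
    If the htest layout does not suit your output, the same mechanism works for a class of your own: give the list a class and define a print method for it. The following is only a sketch of that idea. The class name mytest, the constructor new.mytest and the printed layout are all hypothetical, and the input numbers are copied from the example output above purely for illustration.

    ```r
    #Hypothetical constructor returning an object of a made-up class "mytest"
    new.mytest <- function(statistic, p.value) {
        structure(list(statistic = statistic, p.value = p.value),
                  class = "mytest")
    }

    #Print method: R calls this whenever a "mytest" object is printed
    print.mytest <- function(x, ...) {
        cat("My illustrative test\n")
        cat("statistic = ", format(x$statistic, digits = 4),
            ", p-value = ", format.pval(x$p.value, digits = 4), "\n", sep = "")
        invisible(x)
    }

    #Create and print an object of the new class
    OBJ <- new.mytest(statistic = 2.5988, p.value = 0.009355)
    print(OBJ)
    ```

    Returning invisible(x) from the print method follows the usual convention for print methods, so that printing the object does not trigger a second print of the return value.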