
How can you force inclusion of a level in a table in R?


Is there a way to force R's table function to include rows or columns even when they never occur in the data? For example,

data.1 <- c(1, 2, 1, 2, 1, 2, 4)
data.2 <- c(1, 4, 3, 3, 3, 1, 1)

table(data.1, data.2)

returns

      data.2
data.1  1 3 4
      1 1 2 0
      2 1 1 1
      4 1 0 0

where there's a missing 3 in the rows and a missing 2 in the columns, because they don't appear in the data.

Is there a simple way to force additional rows and columns of zeros to be inserted in the correct place, and instead return the following?

      data.2
data.1  1 2 3 4
      1 1 0 2 0
      2 1 0 1 1
      3 0 0 0 0
      4 1 0 0 0

Solution

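  • Convert both vectors to factors whose levels include every value you want in the output. table() counts every level of a factor, including levels that never occur in the data, so the empty rows and columns show up as zeros.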

    levs <- sort(union(data.1, data.2))
    table(factor(data.1, levs), factor(data.2, levs))
    #    
    #     1 2 3 4
    #   1 1 0 2 0
    #   2 1 0 1 1
    #   3 0 0 0 0
    #   4 1 0 0 0
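One caveat with the approach above: union(data.1, data.2) only recovers values that occur in at least one of the two vectors, so it happens to work here because 2 appears in data.1 and 3 appears in data.2. If a value could be missing from both, the levels have to be spelled out. The following is a minimal sketch of that variant, assuming the categories of interest are the integers 1 to 4 (that range is an assumption, not something stated in the question). Naming the arguments to table() also keeps the data.1 / data.2 headers from the original output.

```r
data.1 <- c(1, 2, 1, 2, 1, 2, 4)
data.2 <- c(1, 4, 3, 3, 3, 1, 1)

# Assumption: the full set of categories is 1 to 4, listed explicitly
# rather than inferred from the data.
levs <- 1:4

# Named arguments become the dimension names of the resulting table.
tab <- table(data.1 = factor(data.1, levels = levs),
             data.2 = factor(data.2, levels = levs))
tab
```

The two factors do not need to share the same levels; passing a different levels vector to each factor() call gives a non-square table if the rows and columns have different categories.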