
In R, how can I test if two factors are equivalent?


I am generating a big list of factors with different levels, and I want to be able to detect when two of them define the same partition. For example, I want to detect all of the following as equivalent to each other:

x1 <- factor(c("a", "a", "b", "b", "c", "c", "a", "a"))
x2 <- factor(c("c", "c", "b", "b", "a", "a", "c", "c"))
x3 <- factor(c("x", "x", "y", "y", "z", "z", "x", "x"))
x4 <- factor(c("a", "a", "b", "b", "c", "c", "a", "a"), levels=c("b", "c", "a"))

What is the best way to do this?


Solution

  • I guess you want to establish that a two-way tabulation has the same number of populated levels as a one-way classification. By default, interaction() keeps every combination of levels, even the unpopulated ones, but setting drop=TRUE discards the empty combinations, which suits your purpose:

    > levels(interaction(x1, x2, drop=TRUE))
    [1] "c.a" "b.b" "a.c"
    > length(levels(x1)) == length(levels(interaction(x1, x2, drop=TRUE)))
    [1] TRUE
    

    To generalize, compare x1 against each of the other factors and wrap the three comparisons in all():

     all(length(levels(x1)) == length(levels(interaction(x1, x2, drop=TRUE))),
         length(levels(x1)) == length(levels(interaction(x1, x3, drop=TRUE))),
         length(levels(x1)) == length(levels(interaction(x1, x4, drop=TRUE))))
    #[1] TRUE
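
One caveat with the comparison above: it only counts the levels of x1. If the second factor merely merges some groups of x1 (for example, a factor with a single level), the interaction still has as many populated levels as x1, so the check would report TRUE. A minimal sketch of a symmetric version follows. The helper name `same_partition` and the extra factor `coarse` are made up for illustration and are not part of the original answer, and the sketch assumes both factors have the same length and contain no NA values.

```r
# Hypothetical helper (not from the original answer): TRUE when f and g
# split their positions into exactly the same groups.
# Assumes equal lengths and no NA values.
same_partition <- function(f, g) {
  stopifnot(length(f) == length(g))
  # Drop unused levels so only populated groups are counted.
  f <- droplevels(as.factor(f))
  g <- droplevels(as.factor(g))
  n_joint <- nlevels(interaction(f, g, drop = TRUE))
  # The joint tabulation must have as many populated cells as EACH factor.
  nlevels(f) == n_joint && nlevels(g) == n_joint
}

x1 <- factor(c("a", "a", "b", "b", "c", "c", "a", "a"))
x2 <- factor(c("c", "c", "b", "b", "a", "a", "c", "c"))
x3 <- factor(c("x", "x", "y", "y", "z", "z", "x", "x"))
x4 <- factor(c("a", "a", "b", "b", "c", "c", "a", "a"), levels = c("b", "c", "a"))
coarse <- factor(rep("k", 8))  # illustrative: one group covering everything

same_partition(x1, x2)      # relabelled copy of x1
same_partition(x1, coarse)  # coarser partition, should not count as equivalent

# Compare x1 against several candidates at once.
all(vapply(list(x2, x3, x4), same_partition, logical(1), f = x1))
```

The only changes from the one-sided check are the call to droplevels() and the second comparison against the levels of g, so the test gives the same answer whichever factor is passed first.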