Tags: r, function, anova, t-test

How to generate "unbalanced" (unequal-sized) factor levels in base R?


Background:

I usually use gl() to generate a factor whose levels each cover the same number of elements. For example, to assign 2 factor levels to the 60 random values in x (30 elements per level), I use the following:

x = rnorm(n = 60)
groups = gl( 2, length(x)/2 ) ## My Factor Levels

But the code above doesn't let me assign, say, the first level to the first 40 elements of x and the second level to the last 20 elements of x (i.e., "unbalanced" (unequal) group sizes).

Question:

In base R, is there a flexible function or strategy for producing "unbalanced" factor levels, i.e., levels with unequal group sizes?


Solution

  • You can use rep with a vector-valued times argument, where each element of times gives the number of repetitions of the corresponding level:

    x <- factor( rep(1:3, times=c(5,10,2)) )
    x
    

    This gives:

    [1] 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
    Levels: 1 2 3
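
Applied to the 40/20 split from the question, the same idea might look like the sketch below. The vector name sizes and the set.seed() call are assumptions added for illustration; they are not part of the original post.

```r
# Reproducible random data (the seed is only for illustration)
set.seed(1)
x <- rnorm(n = 60)

# Hypothetical helper vector: number of elements per factor level
sizes <- c(40, 20)

# Level 1 for the first 40 elements of x, level 2 for the last 20
groups <- factor(rep(seq_along(sizes), times = sizes))

# Count the elements per level to confirm the unbalanced design
table(groups)
```

Because sizes drives both the number of levels and the group sizes, changing it (for example to c(10, 30, 20)) should produce a three-level factor without further edits, as long as sum(sizes) equals length(x). If named levels are preferred, factor() also accepts a labels argument.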