
Only certain values of column as levels in factor


I have a data frame with a column containing the values "a, a, a, b, b, b, happy, sad, angry".

I want to convert the column to a factor using as.factor.

However, I was wondering whether certain values of the column can be grouped together as one level of the factor. For example, 'a' and 'b' as one level, 'happy' as another level, and so on.

How can this be done in code?

EDIT -

I tried to use:

allData$label <- factor(allData$label,
                        levels = c(1,2,3,4),
                        labels = c((c("a","b")),
                                   "happy", "sad", "angry"))

Since I wanted the values 'a' and 'b' to share one label, I put a vector inside a vector, but this gives me errors.


Solution

  • Yes. Use the labels argument of factor():

    
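    x <- c("a", "a", "b", "b", "happy", "sad", "angry")
    # Every distinct value in x, and the label each one should map to
    levels <- c("a", "b", "happy", "sad", "angry")
    labels <- c("letter", "letter", "happy", "sad", "angry")
    
    # Duplicated labels merge "a" and "b" into the single level "letter"
    y <- factor(x, levels = levels, labels = labels)
    
    y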
    

    https://rdrr.io/r/base/factor.html

    "Duplicated values in labels can be used to map different values of x to the same factor level."

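    EDIT: The mistake in your code is the nested vector. c() flattens c(c("a","b"), "happy", "sad", "angry") into five separate labels, which cannot be matched against the four levels you supplied. The levels should also be the actual values in the column ("a", "b", "happy", ...) rather than the numbers 1 to 4.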
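
For completeness, here is a minimal sketch of the same idea applied to a data frame column like the one in the question. The contents of allData are made up for illustration (the question does not show the real data), and "letter" is just a placeholder name for the merged level. Duplicated labels should work in R 3.4.0 or later, as far as I know.

```r
# Hypothetical data frame mirroring the question's allData$label column
allData <- data.frame(
  label = c("a", "a", "a", "b", "b", "b", "happy", "sad", "angry"),
  stringsAsFactors = FALSE
)

# One label per level, in the same order as levels;
# "a" and "b" both get the label "letter", so they collapse into one level
allData$label <- factor(
  allData$label,
  levels = c("a", "b", "happy", "sad", "angry"),
  labels = c("letter", "letter", "happy", "sad", "angry")
)

levels(allData$label)  # the merged set of levels
table(allData$label)   # counts per level
```

Note that levels and labels must have the same length here: each entry in labels names the level that the value at the same position in levels is mapped to.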