
How can I sort a dataframe by a predetermined order of factor levels in R?


I have a data frame in which one column consists of unique factor values. I want to sort this data frame by a predefined order of factor levels, independent of the original row order.

For example my data looks like this:

label <- c('tree', 'lake', 'house', 'human')
number <- c(50, 1, 2, 5)

df <- data.frame(
  group = label,
  value = number)

# desired row order, given as a vector of group names
category_order <- c('tree', 'house', 'lake', 'human')

where df has the form

  group value
1  tree    50
2  lake     1
3 house     2
4 human     5

but I would like it to be sorted as in category_order, so that df_new looks like:

  group value
1  tree    50
2 house     2
3  lake     1
4 human     5

I know that in this case I could just swap the second and third rows, but in general I don't know in which order the factors will appear in the data frame. I couldn't find a way to do this without strong restrictions on which factors I can use and on the order they should end up in (for example, alphabetical order).


Solution

  • We can specify the levels of 'group' as category_order and then use that factor in `arrange`:

    library(dplyr)
    df1 <- df %>% 
              arrange(factor(group, levels = category_order))
    df1
    #  group value
    #1  tree    50
    #2 house     2
    #3  lake     1
    #4 human     5
    

    Or use fct_relevel from forcats (dplyr is still needed for %>% and arrange):

    library(forcats)
    df %>% 
       arrange(fct_relevel(group, category_order))
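
The same idea can also be sketched in base R without extra packages. This is not part of the original answer; it assumes every value in group appears in category_order (values that do not are matched to NA and end up last, since order() puts NA last by default). The name df_new is taken from the question.

```r
# Example data from the question
df <- data.frame(
  group = c("tree", "lake", "house", "human"),
  value = c(50, 1, 2, 5)
)
category_order <- c("tree", "house", "lake", "human")

# match() returns each group's position in category_order;
# order() turns those positions into a row permutation.
df_new <- df[order(match(df$group, category_order)), ]

# Optional: reset row names so they run 1..n again
rownames(df_new) <- NULL
df_new
```

Like the arrange() versions, this only changes the row order. It leaves the group column itself as it was and does not convert it to a factor.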