r, forcats

Convert to factor and then display in a custom order on graph


I have a character column in the following data frame. I would like to convert it to a factor so that, when I plot the results with ggplot, the labels on my x-axis appear in a specific order:

  df <-  structure(list(Level = c("1", "1", "1", "1", "1", "2", "1", "1"
), Variable = c("lskill_wc", "Grande_Estab", "lskill_wc", "lskill_bc", 
"hskill_wc", "balcadv", "hskill_bc", "Vinculos_Micro"), estimate = c(0.154462929180099, 
-0.00565989816383741, 0.127039272664461, 0.244657086455149, 0.153358091697942, 
-0.00769107968294057, -0.00592547333520778, 0.138216262540319
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-8L))

I want to change the first two columns to factors, recode them and change their order:

  Level Correspondence
  <chr> <chr>         
1 1     A         
2 2     B    

  Variable       Correspondence         
  <chr>          <chr>                  
1 lskill_wc      Low skill white collar     
2 lskill_bc      Low skill blue collar  
3 hskill_bc      High skill white collar
4 Grande_Estab   Large firm                       
5 Vinculos_Micro Employment
6 balcadv        Comp. Adv

On the graph, the labels should appear in the order in which they are listed in each Correspondence column.


Solution

  • Since you have completely changed the question, my original answer has become invalidated. I have been compelled to change it to avoid attracting downvotes.

    It is probably better to ask a new question in this situation.

    The answer is in exactly the same format as the original; you just seem to be getting mixed up about your factor levels. Your list of desired labels actually leaves one out: "High skill blue collar worker".

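    # as.factor() sorts the levels alphabetically (here: balcadv, Grande_Estab,
    # hskill_bc, hskill_wc, lskill_bc, lskill_wc, Vinculos_Micro), so `labels`
    # lists the new names in that same order.
    labels <- c( "Comp. Adv", "Large firm", "High skill blue collar worker", 
                 "High skill white collar worker", "Low skill blue collar worker",
                 "Low skill white collar worker", "Employment")
    df$label <- as.factor(df$Variable)
    levels(df$label) <- labels
    # Rebuild the factor with the levels in the order wanted on the axis.
    df$label <- factor(df$label, labels[c(6, 5, 4, 3, 2, 7, 1)])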
    df
    #> # A tibble: 8 x 4
    #>   Level Variable       estimate label                         
    #>   <chr> <chr>             <dbl> <fct>                         
    #> 1 1     lskill_wc       0.154   Low skill white collar worker 
    #> 2 1     Grande_Estab   -0.00566 Large firm                    
    #> 3 1     lskill_wc       0.127   Low skill white collar worker 
    #> 4 1     lskill_bc       0.245   Low skill blue collar worker  
    #> 5 1     hskill_wc       0.153   High skill white collar worker
    #> 6 2     balcadv        -0.00769 Comp. Adv                     
    #> 7 1     hskill_bc      -0.00593 High skill blue collar worker 
    #> 8 1     Vinculos_Micro  0.138   Employment  
    
    levels(df$label)
    #> [1] "Low skill white collar worker"  "Low skill blue collar worker"  
    #> [3] "High skill white collar worker" "High skill blue collar worker" 
    #> [5] "Large firm"                     "Employment"                    
    #> [7] "Comp. Adv"
    

    Created on 2020-03-04 by the reprex package (v0.3.0)
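
The relabelling in the answer depends on the alphabetical order that as.factor() gives the levels, and that order can differ between locales. A minimal alternative sketch in base R spells out the code-to-label mapping as a named vector and passes it to factor() through `levels` and `labels`, so the order is whatever the vector says. Several things here are assumptions for illustration: the names `variable_labels` and `level_labels`, the use of a plain data.frame instead of the tibble, and the estimates rounded from the question's data. The A/B recoding of Level follows the table in the question.

```r
# Copy of the question's data as a base data.frame (estimates rounded).
df <- data.frame(
  Level    = c("1", "1", "1", "1", "1", "2", "1", "1"),
  Variable = c("lskill_wc", "Grande_Estab", "lskill_wc", "lskill_bc",
               "hskill_wc", "balcadv", "hskill_bc", "Vinculos_Micro"),
  estimate = c(0.154, -0.00566, 0.127, 0.245, 0.153, -0.00769, -0.00593, 0.138),
  stringsAsFactors = FALSE
)

# Named vectors: names are the raw codes, values are the display labels.
# The order of each vector is the order the factor levels will take.
variable_labels <- c(
  lskill_wc      = "Low skill white collar worker",
  lskill_bc      = "Low skill blue collar worker",
  hskill_wc      = "High skill white collar worker",
  hskill_bc      = "High skill blue collar worker",
  Grande_Estab   = "Large firm",
  Vinculos_Micro = "Employment",
  balcadv        = "Comp. Adv"
)
level_labels <- c("1" = "A", "2" = "B")

# Stop early if any code in the data has no label in the mapping.
stopifnot(all(df$Variable %in% names(variable_labels)),
          all(df$Level %in% names(level_labels)))

# Recode and order in one step: `levels` matches the raw codes,
# `labels` supplies the display names in the same positions.
df$label <- factor(df$Variable,
                   levels = names(variable_labels),
                   labels = unname(variable_labels))
df$Level <- factor(df$Level,
                   levels = names(level_labels),
                   labels = unname(level_labels))

levels(df$label)
levels(df$Level)
```

Since ggplot2 lays out a discrete axis in factor-level order, mapping `label` to the x aesthetic should then show the categories in this sequence. The stopifnot() check would also have flagged a code that is missing from the mapping.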