How is age classified as a categorical variable?

OK, this question is very basic, but I can't figure it out, so I need your help. I understand the idea of splitting age into categories. For example: [figure: good graph]

I don't understand how the model knows that the 30< category comes before the 31-45 category, that the 31-45 category comes before the 46-60 category, and so on. How does the model know not to produce this graph: [figure: bad graph]



  • Consider this example:

    age <- 1:100
    fctr <- cut(age, breaks = c(0, 25, 50, 75, 100))  # cut() already returns a factor
    levels(fctr)
    #> [1] "(0,25]"   "(25,50]"  "(50,75]"  "(75,100]"

    Here you can see how the levels are ordered: cut() creates them in the order of the breaks, not alphabetically. This level order is what plot and ggplot2 use for the axis. You can change the order as follows:

    fctr2 <- factor(fctr, levels = levels(fctr)[c(2, 1, 3, 4)])  # swap the first two levels
    levels(fctr2)
    #> [1] "(25,50]"  "(0,25]"   "(50,75]"  "(75,100]"

    If you work with factors often, consider using the forcats package.
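
    As a rough illustration of the same point with labels like the ones in the question, here is a minimal base R sketch. The ages, the breaks, the "60+" label, and the young/middle/old groups are assumptions made up for this example, not taken from the asker's data. It shows that the order is stored in the factor's levels, and that a "bad graph" typically appears when the groups are plain text, because factor() then sorts the levels alphabetically.

    ```r
    # Hypothetical ages, breaks and labels (illustration only)
    age <- c(22, 67, 35, 51, 29, 48, 80, 41)
    age_group <- cut(age,
                     breaks = c(0, 30, 45, 60, Inf),
                     labels = c("30<", "31-45", "46-60", "60+"))

    # The levels follow the order of the breaks
    levels(age_group)
    #> [1] "30<"   "31-45" "46-60" "60+"

    # Internally each value is just the position of its level
    as.integer(age_group)
    #> [1] 1 4 2 3 1 3 4 2

    # Text labels carry no order: factor() sorts them alphabetically
    grp <- c("young", "old", "middle", "young")
    levels(factor(grp))
    #> [1] "middle" "old"    "young"

    # Passing the levels explicitly fixes the order
    grp_f <- factor(grp, levels = c("young", "middle", "old"))
    levels(grp_f)
    #> [1] "young"  "middle" "old"
    ```

    Plots and models read the order from levels(), so setting the levels once on the factor is usually enough. In forcats, fct_relevel() offers a similar way to move levels around.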