
Adjusting Y axis values in a bar graph


I have the following Stata code, which produces the graph below. I am trying to sort the values on the Y axis, which show workers in different income groups, so that they follow the same order as the dataset. In the current order, those making <= 1.5 thousand dollars per month appear as the fifth rather than the first income group.

number_of_workers   income_bracket
24806               <= 1.5
31346               1.5-2.9
648409              3
389266              3.01-4.9
351963              5-9.9
271360              >= 10

The income bracket variable is a string, and I tried to convert it as follows:

gen income_bracket_numeric = real(income_bracket)

However, Stata returns missing (.) for every converted value except 3, the only bracket that is a plain number. Is there a way to make the income bracket ranges numeric without altering the original data?

preserve
keep if nationality=="nationals"
keep if period=="Q1_2020" | period=="Q4_2020" 
graph hbar (mean)  number_of_workers, over(income_bracket) over(quarter)
restore

[image: graph produced by the code above]

I incorporated Nick's advice, and my code looks as follows:

input number_of_workers_q1 str8 income_bracket_q1
24806 "<= 1.5"
31346 "1.5-2.9"
648409 "3"
389266 "3.01-4.9"
351963 "5-9.9"
271360 ">= 10"
end 


input number_of_workers_q4 str8 income_bracket_q4
25073 "<= 1.5"
29628 "1.5-2.9"
596767 "3"
442429 "3.01-4.9"
381794 "5-9.9"
273880 ">= 10"
end 

gen order = _n 
labmask order, values(income_bracket_q1)
graph hbar (asis) number_of_workers_q1 number_of_workers_q4, over(order)

label define order 1 "{&le} 1.5" 6 "{&ge} 10", modify 
graph hbar (asis) number_of_workers_q1 number_of_workers_q4, over(order)

The graph works well, except for the Y axis, which looks as shown below:

[image: resulting graph]


Solution

  • Note that your question is about tweaking the categorical axis. With graph bar, graph hbar and graph dot, the magnitude axis is always considered to be the y axis, regardless of whether it is drawn vertically or horizontally. This convention lets you switch the orientation between horizontal and vertical without having to change all the y options and all the x options.

    Here is one way to do it using labmask from the Stata Journal.

    clear 
    input number_of_workers str8 income_bracket
    24806 "<= 1.5"
    31346 "1.5-2.9"
    648409 "3"
    389266 "3.01-4.9"
    351963 "5-9.9"
    271360 ">= 10"
    end 
    
    gen order = _n 
    labmask order, values(income_bracket)
    graph hbar (asis) number_of_workers, over(order)
    
    label define order 1 "{&le} 1.5" 6 "{&ge} 10", modify 
    graph hbar (asis) number_of_workers, over(order)
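
    To make the point about axes concrete, here is a minimal sketch, meant to be run from a do-file, in which the y options change the magnitude axis (drawn horizontally by graph hbar) while suboptions of over() change the categorical axis. It is an illustration under assumptions rather than part of the method above: the helper variable order, the sort() suboption (used here in place of labmask so nothing extra needs installing), the axis title and the %9.0fc format are all my own choices.

    ```stata
    * Same six brackets as in the example above
    clear
    input number_of_workers str8 income_bracket
    24806 "<= 1.5"
    31346 "1.5-2.9"
    648409 "3"
    389266 "3.01-4.9"
    351963 "5-9.9"
    271360 ">= 10"
    end

    * Hypothetical helper variable: records the dataset order
    gen order = _n

    * ytitle() and ylabel() act on the magnitude axis, even though hbar draws it
    * horizontally; label() and sort() inside over() act on the categorical axis
    graph hbar (asis) number_of_workers, ///
        over(income_bracket, sort(order) label(labsize(small))) ///
        ytitle("Number of workers") ylabel(, format(%9.0fc))
    ```

    Changing hbar to bar in the last command should flip the orientation without any of the y options needing to be renamed.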
    

    You can do it without labmask if you define value labels in the right order and then use encode.

    As above, you can improve on the crude double symbols <= and >=.
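
The encode route mentioned above might look like the following sketch. The label name bracket and the new variable bracket_code are hypothetical names chosen for illustration, and the noextend option is my addition, so that encode stops with an error rather than quietly appending new codes if a string has no matching label.

```stata
clear
input number_of_workers str8 income_bracket
24806 "<= 1.5"
31346 "1.5-2.9"
648409 "3"
389266 "3.01-4.9"
351963 "5-9.9"
271360 ">= 10"
end

* Define the value labels in the order the bars should appear
label define bracket 1 "<= 1.5" 2 "1.5-2.9" 3 "3" 4 "3.01-4.9" 5 "5-9.9" 6 ">= 10"

* encode reuses the existing label mapping instead of alphabetical order;
* the original string variable income_bracket is left untouched
encode income_bracket, generate(bracket_code) label(bracket) noextend

* Optional: swap the double symbols for single characters after encoding
label define bracket 1 "{&le} 1.5" 6 "{&ge} 10", modify

graph hbar (asis) number_of_workers, over(bracket_code)
```

The label text is changed only after encode has run, because encode matches the strings in income_bracket against the label text exactly.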