Create percentage bar chart in Stata for categorical variables from R code


I'm new to Stata and trying to recreate my R code there. I have two factor variables I want to plot. One is expenses, and takes the values "Afs 2500-5000", "Afs 5000-7500", "Afs 7500-10000", "Less than Afs 2500" and "More than Afs 10000".

The other one is education level, and takes the values "High school", "Madrassa", "No schooling", "Other", "Primary school" and "Secondary school".

To plot a bar chart with percentages, I used

educ <- with(data, table(expenses, education))
education <- round(prop.table(educ, 2) * 100, digits = 0)
barplot(prop.table(education, 2) * 100,
        xlab = "Education level", ylab = "Percentages",
        main = "Monthly expenses by education status",
        beside = TRUE, col = ramp.list,
        legend = rownames(education),
        args.legend = list(x = "topleft", cex = 0.3))

which gave me this: [image: percentage bar chart]

How can I do the same in Stata? There seems to be no easy way to recode a variable like in R with as.factor. The closest I've got was this:

encode Education, generate(educ)
tabulate Expenditure educ, col

table Expenditure, stat(fvpercent educ) won't work.

What is the Stata equivalent of R's as.factor, and how can I generate visualisations like the one I presented above? Thanks!


Solution

  • In the absence of a data example:

    Note first that your graph is hard to read because the categories of both the predictor and the outcome are in alphabetical (alphanumeric) order rather than their natural order. For example, the expenses variable should start with "Less than Afs 2500", and the education variable should start with "No schooling", followed by "Primary school".
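
    One way to get a sensible order, and roughly what as.factor with explicit levels does in R, is to define the value labels yourself and then pass them to encode. This is only a sketch: it assumes the string variables are named Expenditure and Education as in the question, the label names exp_lbl and educ_lbl are made up for illustration, and the order of the education levels after "Primary school" is a guess that you should adjust.

    ```stata
    * Sketch only: variable names are taken from the question;
    * the label names exp_lbl and educ_lbl are hypothetical.
    * The education order after "Primary school" is an assumption.
    label define exp_lbl 1 "Less than Afs 2500" 2 "Afs 2500-5000" ///
        3 "Afs 5000-7500" 4 "Afs 7500-10000" 5 "More than Afs 10000"
    label define educ_lbl 1 "No schooling" 2 "Primary school" ///
        3 "Secondary school" 4 "High school" 5 "Madrassa" 6 "Other"

    * encode assigns each string the code given in the named value label;
    * noextend makes it refuse to encode if it meets a string that is
    * not in the label, rather than quietly adding a new code
    encode Expenditure, generate(expenses) label(exp_lbl) noextend
    encode Education, generate(educ) label(educ_lbl) noextend
    ```

    The /// line continuations work in a do-file or the Do-file Editor, not when typed line by line in the Command window. Graph and tabulation commands then show the categories in the order of the numeric codes.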

    On the main question: in Stata you could use graph bar directly, or any command that is a wrapper for it, such as catplot from SSC, which is used below.

    This example is reproducible:

    sysuse auto, clear

    set scheme s1color

    * catplot is community-contributed and must be installed before first use
    ssc install catplot

    * bars show the percent of each rep78 category within each category of foreign
    catplot rep78, over(foreign) percent(foreign) recast(bar) ///
        asyvars bar(1, color(red)) bar(2, fcolor(red*0.4) lcolor(red)) ///
        bar(3, fcolor(blue*0.2) lcolor(blue)) bar(4, fcolor(blue*0.6) lcolor(blue)) ///
        bar(5, color(blue)) legend(row(1)) ytitle(% of Domestic or foreign) yla(, ang(h))

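
    Adapting the auto example to the data in the question might look like the following. This is a sketch without access to the data: it assumes numeric, value-labelled variables named expenses and educ (for instance created with encode), and the titles are copied from the R call in the question.

    ```stata
    * Sketch only: expenses and educ are assumed to be numeric variables
    * with value labels attached, for example created by encode

    * check the column percentages that the graph should reproduce
    tabulate expenses educ, column nofreq

    * percent of each expenses category within each education level
    catplot expenses, over(educ) percent(educ) recast(bar) asyvars ///
        ytitle("% within education level") ///
        title("Monthly expenses by education status") ///
        legend(pos(11) ring(0) col(1))
    ```

    The legend options place the key inside the plot region at the top left, roughly matching x = "topleft" in the R call. Bar colours can be set with bar() options as in the auto example.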