We measured antibody levels in several age groups, and the sample size differed between groups.
I would like to show the sample size of each group at the top of its box in my box plot (e.g. the sample size for toddler boys). The attached photo shows one of my box plots.
My code to create the graph:
graph box igg1ugml_log10, over(female_sex) over(age_groups) ///
bar(samplesize) graphregion(color(white)) ///
title(Anti-EPEC IgG1 (ug/ml) in boys and girls) asyvars ///
ylabel(2.69897 "500" 3 "1,000" 3.3 "2000" 3.69 "5000" 3.95 "9000")
To add text to graph box, use the documented text() option. Here is a reproducible example. Apart from using the Graph Editor, I have no recipe for working out text positions other than fiddling until the result looks good enough.
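sysuse auto, clear
gen logprice = log10(price)
* mylabels is from SSC; it only needs installing once
ssc install mylabels
* check the range of price before choosing axis labels
su price
* put labels in original units at their log10 positions into local yla
mylabels 3000(2000)15000, myscale(log10(@)) local(yla)
* text(y x "string"): the y and x positions here were found by trial and error
graph box logprice, over(foreign) yla(`yla', ang(h)) ///
text(4.25 21.2 "{it:n} = 52") text(4.25 79.8 "{it:n} = 22") ///
ysc(r(. 4.3)) scheme(s1color) ytitle(Price (USD))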
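The counts 52 and 22 in the text() calls above were typed in by hand. As a minimal sketch, assuming you would rather have Stata count each group, the counts can be stored in local macros and substituted into the label strings. The macro names n0 and n1 are my own choice for illustration, and the x positions 21.2 and 79.8 are still the hand-tuned values from the example above.

```stata
sysuse auto, clear
gen logprice = log10(price)

* count nonmissing observations in each group of foreign
count if foreign == 0 & !missing(logprice)
local n0 = r(N)
count if foreign == 1 & !missing(logprice)
local n1 = r(N)

* substitute the counts into the text() labels; positions remain hand-tuned
graph box logprice, over(foreign) ///
    text(4.25 21.2 "{it:n} = `n0'") text(4.25 79.8 "{it:n} = `n1'") ///
    ysc(r(. 4.3)) ytitle("log10 price")
```

With two over() variables, as in the question, the same idea needs one count and one text() call per box, so the positions would again need fiddling.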
Note: To show the mu of microgram properly, see help graph text in Stata and search for Greek letters.
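As a small sketch of what that help file describes: in reasonably recent versions of Stata, graph text accepts SMCL-style tags for Greek letters, and {&mu} gives a lowercase mu. The auto data below are only a stand-in so that the example runs; the title is the one from the question.

```stata
* auto.dta is only a placeholder dataset; the point is the {&mu} tag in the title
sysuse auto, clear
graph box price, over(foreign) ///
    title("Anti-EPEC IgG1 ({&mu}g/ml) in boys and girls")
```

Whether the symbol renders well can depend on the font and the export format, so check the exported file as well as the Graph window.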
EDIT
stripplot from SSC can produce box plots too, although both its defaults and its possibilities differ from those of graph box. Here is a reproducible example.
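sysuse auto, clear
* stripplot is from SSC; it only needs installing once
ssc install stripplot
* number of nonmissing mpg values in each category of rep78
egen count = count(mpg), by(rep78)
* vertical position for the count labels, chosen by trial and error
gen where = 10.5
stripplot mpg , box vertical ms(none) pctile(5) over(rep78) ///
yla(12 41 15(5)40, ang(h)) ///
addplot(scatter where rep78, mla(count) ms(none) mlabpos(0) ///
mlabsize(medsmall)) scheme(s1color)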
Again, although this is reproducible code, the choice of 10.5 comes from experimenting with other values not shown here. You could try to automate the choice with a calculation based on the sample maximum and minimum and, naturally, your preference for where the labels should go. If you were producing dozens of these plots, that would be a good idea. For a single plot for a paper or presentation, I would just play.
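One way to automate that choice, as a sketch only: put the labels a fixed fraction of the range below the sample minimum. The fraction 0.05 is an arbitrary assumption of mine, not a recommendation, and for mpg it happens to land close to the hand-picked 10.5.

```stata
sysuse auto, clear
egen count = count(mpg), by(rep78)

* summarize leaves the minimum and maximum in r(min) and r(max)
su mpg, meanonly

* place the labels 5% of the range below the minimum (0.05 is arbitrary)
gen where = r(min) - 0.05 * (r(max) - r(min))

* stripplot must be installed from SSC first
stripplot mpg, box vertical ms(none) pctile(5) over(rep78) ///
    yla(12 41 15(5)40, ang(h)) ///
    addplot(scatter where rep78, mla(count) ms(none) mlabpos(0) ///
    mlabsize(medsmall))
```

Changing the fraction, or using r(max) plus a margin instead, moves the labels further away or above the boxes.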