
Estimating the percent trend of a categorical variable per year in Stata


I am trying to graph a trend line of a categorical variable by gender in Stata: the percentage in each level of the variable (1, 2, and 3) over time, shown separately by gender. I can't seem to nail it down. I've generated a sample dataset for your review. Any help or advice would be greatly appreciated.

clear 

*generating random dataset
set obs 100 
set seed 2803 
egen year = seq(), from(2000) to(2010)
egen problemcat = seq(), to(3) block(3)
egen gender = seq(), to(2) block(5)
label def cat 1 problemthisyear 2 problemsincethen 3 noproblem 
label val problemcat cat 

*want to graph trend over time for each category, a percent for each category per year over time, by gender
*my attempt: 

gen problemthisyear = 1 if problemcat == 1
gen problemsincethen = 1 if problemcat == 2 
gen noproblem = 1 if problemcat == 3

poisson problemthisyear c.year i.gender, vce(robust) irr
predict probyhat, pr(1)

*my attempt at plotting but not sure how to separate it by gender
twoway (scatter probyhat year, lcolor(navy))

Solution

  • Presumably wanting a trend line means wanting the percentage breakdown of problems by gender and by year.

    I don't follow what Poisson regression is supposed to do here. On the face of it you're asking for descriptive statistics and in any case there are three outcome categories.

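    The graph here uses tabplot from the Stata Journal. It is a community-contributed command, so it must be installed before use (search tabplot will locate it).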

    clear 
    
    *generating random dataset
    set obs 100 
    set seed 2803 
    egen year = seq(), from(2000) to(2010)
    egen problemcat = seq(), to(3) block(3)
    egen gender = seq(), to(2) block(5)
    label def cat 1 "this year" 2 "since then" 3 "no problem"
    label val problemcat cat 
     
    tabplot problemcat gender, by(year, row(1) note("") compact) showval(mlabsize(medium)) ytitle("") percent(gender year)
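
    As a numeric cross-check on the bars, the same breakdown can be tabulated with official commands only. This is a minimal sketch, assuming the sample dataset above is still in memory; the row percentages should correspond to what percent(gender year) displays, since each row is one year within one gender.

    ```stata
    * Assumes the sample dataset generated above is in memory.
    * One two-way table per gender; row percentages give the share of
    * each problem category within each year, and each row sums to 100.
    bysort gender: tabulate year problemcat, row nofreq
    ```

    Dropping nofreq shows the cell counts as well, which is useful here because some gender-year cells in a dataset of 100 observations are very small.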
    


    EDIT

    Here is code for a combined line plot using fabplot from the Stata Journal.

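    * total: number of observations in each year (both genders combined)
    bysort year (gender problemcat) : gen total = _N 
    * percent: each gender-category cell as a share of that yearly total
    by year gender problemcat : gen percent = 100 * _N / total 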
    
    fabplot line percent year, by(gender problemcat) frontopts(lw(thick))
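
    Note that total above counts all observations in a year, so percent is a share of the yearly total across both genders, whereas the tabplot call used percent(gender year). If the shares should instead sum to 100 within each gender and year, one possible variant is sketched below. It assumes the sample dataset is in memory and is meant to be run from a do-file (because of the /// continuations). The names total_gy, percent_gy and tag are made up for illustration, and plain twoway line is swapped in for fabplot so that no extra installation is needed.

    ```stata
    * Assumes the sample dataset is in memory.
    * total_gy, percent_gy and tag are hypothetical names.

    * Denominator: observations in each gender-year cell
    bysort gender year: gen total_gy = _N
    * Numerator: observations in each gender-year-category cell
    bysort gender year problemcat: gen percent_gy = 100 * _N / total_gy

    * Keep one plotted point per gender-year-category cell
    egen tag = tag(gender year problemcat)
    sort gender problemcat year

    * One line per problem category, one panel per gender
    twoway (line percent_gy year if problemcat == 1 & tag) ///
           (line percent_gy year if problemcat == 2 & tag) ///
           (line percent_gy year if problemcat == 3 & tag), ///
           by(gender, note("")) ytitle("Percent") ///
           legend(order(1 "this year" 2 "since then" 3 "no problem"))
    ```

    A category with no observations in a given gender and year has no data point there, so its line is joined straight across the gap rather than dropping to zero.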