How to compute the sum of some variables in Stata?


I have 48 variables in my dataset: the first 12 concern the year 2000, the second 12 the year 2001, the third 12 the year 2002 and the fourth 12 the year 2003. The values are laid out like this:

ID var1 var2 var3 ... var12 ... var48
xx 0 0 1 ... 1 ... 0
yy 1 0 0 ... 9 ... 0
zz 3 2 1 ... 0 ... 0

Now I want to store the sum of all the values of the first 12 variables in a new variable called, say, "tot_2000", which should contain just one number (18 in this example). Then I need to repeat this step for the 3 remaining years, giving 4 variables ("tot_2000", "tot_2001", "tot_2002", "tot_2003") to be plotted in a histogram.

What I'm looking for is a variable like this:

tot_2000
18

Solution

  • ORIGINAL QUESTION, addressed by @TheIceBear and myself.

    I have a dataset that contains, say, 12 variables with values 0, 1, 2, ... like this, for example:

    ID var1 var2 var3 ... var12
    xx 0 0 1 ... 1
    yy 1 0 0 ... 9
    zz 3 2 1 ... 0

    and I want to create a variable that is just the sum of all the values (18 in this case), like:

    tot_var 18

    What is the command?

    FIRST ANSWER FROM ME

    Here is another way to do it, as indicated in a comment on the first answer by @TheIceBear.

    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str2 ID byte(var1 var2 var3 var4)
    "xx" 0 0 1 1
    "yy" 1 0 0 9
    "zz" 3 2 1 0
    end
    
    mata : total = sum(st_data(., "var1 var2 var3 var4")) 
    
    mata : st_numscalar("total", total)
    
    di scalar(total)
    18
    
    

    The two Mata commands could be telescoped into one.
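
    As a minimal sketch of that telescoping, the call to sum() can be passed directly as the second argument of st_numscalar(), so that no intermediate Mata variable is needed. It assumes the same four-variable example data as above.

    ```stata
    * same example data as above
    clear
    input str2 ID byte(var1 var2 var3 var4)
    "xx" 0 0 1 1
    "yy" 1 0 0 9
    "zz" 3 2 1 0
    end

    * read the data, sum every cell and store the result in a Stata scalar, all in one step
    mata : st_numscalar("total", sum(st_data(., "var1 var2 var3 var4")))

    di scalar(total)
    ```

    The display command should show 18, as before.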

    SECOND ANSWER

    A quite different question is emerging slowly from comments and edits. The question is still unfocused, but here is an attempt to sharpen it up.

    You have monthly data for various identifiers. You want to see bar charts (not histograms) with annual totals.

    The data structure or layout you have is a poor fit for handling such data in Stata. You have a so-called wide layout, but a long layout is greatly preferable. With a long layout, your totals can be put in a variable for graphing.

    * fake dataset 
    clear
    set obs 3 
    gen id = word("xx yy zz", _n)
    
    forval j = 1/48 { 
        gen var`j' = _n * `j'
    }
    
    * you start here 
    reshape long var, i(id) j(time)
    gen mdate = ym(1999, 12) + time 
    format mdate %tm 
    gen year = year(dofm(mdate))
    
    * not clear that you want this, but it could be useful 
    egen total = total(var), by(id year)
    twoway bar total year, by(id) xla(2000/2003) name(G1, replace)
    
    * this seems to be what you are asking for 
    egen TOTAL = total(var), by(year)
    twoway bar TOTAL year, base(0) xla(2000/2003) name(G2, replace)
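
    If reshaping is not wanted, the four totals named in the question could also be sketched in the wide layout. This is only an illustration, not the approach recommended above. It assumes that the variables are stored in the order var1 to var48, and the names row_2000 and so on are hypothetical helpers chosen here.

    ```stata
    * fake dataset, as above
    clear
    set obs 3
    gen id = word("xx yy zz", _n)

    forval j = 1/48 {
        gen var`j' = _n * `j'
    }

    * one variable per year holding the grand total of that year's 12 variables
    forval y = 2000/2003 {
        local first = 12 * (`y' - 2000) + 1
        local last = `first' + 11
        * row sums over the 12 monthly variables (assumes var1-var48 are stored in that order)
        egen row_`y' = rowtotal(var`first'-var`last')
        * column sum of the row sums, repeated in every observation
        egen tot_`y' = total(row_`y')
        drop row_`y'
    }

    list id tot_2000-tot_2003
    ```

    Each tot_ variable is constant across observations, so the year is carried only in the variable name. That is one reason the long layout, with year as a variable, is easier to graph.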