stata

Use of count command


I am using a dataset, which among other variables includes the following:

. describe year country co brand

              storage   display    value
variable name   type    format     label      variable label
----------------------------------------------------------------------------------------------------------------
year            int     %9.0g                 year (=first dimension of panel)
country         byte    %9.0g      market     market (=second dimension of panel)
co              int     %9.0g                 model code (=third dimension of panel)
brand           byte    %21.0g     brand      brand code

After I load the dataset, I generate a new variable and declare my data to be panel:

egen yearcountry = group(year country), label
xtset co yearcountry 

I would like to estimate the market share of each brand in each country.

For example:

count if brand=="AlfaRomeo" & country=="Italy"

However, I get the following error:

type mismatch
r(109);

The entire dataset consisting of 11,483 observations can be downloaded from here.


Solution

  • The following works for me:

    . count if brand == 1 & country == 4
      111
    

    The type mismatch arises because brand and country are not string variables but numeric variables with value labels, so they cannot be compared with string literals:

    . tabulate country
    
         market |
       (=second |
      dimension |
      of panel) |      Freq.     Percent        Cum.
    ------------+-----------------------------------
        Belgium |      2,641       23.00       23.00
         France |      2,252       19.61       42.61
        Germany |      2,281       19.86       62.47
          Italy |      2,020       17.59       80.07
             UK |      2,289       19.93      100.00
    ------------+-----------------------------------
          Total |     11,483      100.00
    
    . tabulate country, nolabel
    
         market |
       (=second |
      dimension |
      of panel) |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |      2,641       23.00       23.00
              2 |      2,252       19.61       42.61
              3 |      2,281       19.86       62.47
              4 |      2,020       17.59       80.07
              5 |      2,289       19.93      100.00
    ------------+-----------------------------------
          Total |     11,483      100.00
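
    If you would rather refer to the labels than to the underlying codes, Stata expressions accept the "text":labelname syntax, which looks up the numeric value behind a label. The following is a minimal sketch on a toy dataset, not the original data. It assumes the value labels are named brand and market (as the describe output suggests) and that the brand label text is exactly AlfaRomeo:

    ```stata
    * toy data (hypothetical): three observations with numeric codes
    clear
    input country brand
    4 1
    4 2
    1 1
    end

    * value labels mirroring those in the question (brand texts are assumed)
    label define market 1 "Belgium" 2 "France" 3 "Germany" 4 "Italy" 5 "UK"
    label define brand 1 "AlfaRomeo" 2 "Audi"
    label values country market
    label values brand brand

    * "text":labelname returns the code behind the label, so no type mismatch
    count if brand == "AlfaRomeo":brand & country == "Italy":market
    ```

    On the real dataset, where the labels are already attached, only the final count line should be needed.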
    

    However, note that what you are calculating here is not the market share, but the number of observations (one per model, country and year) for a particular brand in a certain country. The percentage market share is usually defined as the ratio of a brand's unit sales to total market unit sales, multiplied by 100.

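    The following code snippet computes each brand's percentage share of the observations in each year. Note that the grouping is by year only, so every country gets the same share within a year:

    * loop over the 47 brand codes
    forvalues i = 1 / 47 {
        * a_i: number of observations of brand i in each year
        bysort year (country): egen a_`i' = total(brand == `i')
        * b_i: brand i's share (in percent) of that year's observations
        bysort year (country):  gen b_`i' = (a_`i' / _N) * 100
    }
    
    * keep one row per country and year (b_* are constant within year)
    collapse b_*, by(country year)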
    

    You can also check that the shares in each row add up to 100 as follows:

    egen all = rowtotal(b_*)
    

    You could then list the shares for AlfaRomeo and Audi (brand codes 1 and 2, stored in b_1 and b_2), for the years 1970 and 1976 and for Belgium and France, as follows:

    format b_* all %4.2f
    
    list year country b_1 b_2 all if inlist(year,1970, 1976) & inlist(country, 1, 2), noobs
    
      +---------------------------------------+
      | year   country    b_1    b_2      all |
      |---------------------------------------|
      | 1970   Belgium   3.31   7.35   100.00 |
      | 1976   Belgium   5.01   4.13   100.00 |
      | 1970    France   3.31   7.35   100.00 |
      | 1976    France   5.01   4.13   100.00 |
      +---------------------------------------+
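
    The listing shows the same shares for Belgium and France because the snippet groups by year only. If shares within each country and year are wanted instead, one possible variant is to put country into the by-group as well. The sketch below uses a small made-up dataset rather than the original data, and it loops over the brand codes found by levelsof instead of hard-coding 1 / 47. Like the code above, it measures shares of observations, not of unit sales:

    ```stata
    * toy data (hypothetical): one year, two countries, two brands
    clear
    input year country brand
    1970 1 1
    1970 1 2
    1970 1 2
    1970 1 2
    1970 2 1
    1970 2 1
    end

    * collect the brand codes actually present in the data
    levelsof brand, local(brands)

    foreach i of local brands {
        * a_i: observations of brand i within each country-year
        bysort year country: egen a_`i' = total(brand == `i')
        * b_i: brand i's percentage share of that country-year's observations
        bysort year country: gen b_`i' = (a_`i' / _N) * 100
    }

    * one row per country-year
    collapse b_*, by(country year)

    * the shares in each row should sum to 100
    egen all = rowtotal(b_*)
    assert abs(all - 100) < 0.01

    list, noobs
    ```

    With this grouping the shares differ by country: in the toy data, brand 1 has 25 percent of the observations in country 1 and 100 percent in country 2.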