I want to do a simple Excel-esque COUNTIFS computation in Stata.
I have four variables: AllCEOs, Year, TwoSic1, and TwoSic2. AllCEOs is a dummy marking valid entries (i.e. 0/1), Year is the year, and TwoSic1 and TwoSic2 are string variables that describe industries. I now want to count, per year, how many AllCEOs there are in a given Year for a specific TwoSic value.
The step where I am struggling is that I need to count across both TwoSic1 and TwoSic2, as the industry ID I am summing over might be in either variable. As far as I know, I cannot use a simple egen newvar = total(AllCEOs), by(TwoSic1 TwoSic2 Year), as it does not account for the case where a value in TwoSic2 is also present in the TwoSic1 variable, and vice versa.
So far I have tried to follow advice from other threads (e.g. "Is there an equivalent to the COUNTIFS function of Excel in Stata?"), but the results do not make any sense in my case.
Edit: To clarify the data and what I want to achieve. I (simply) want to count: _CountTwoSic1[row1] shall depict how often the value of TwoSic1[row1] can be found in columns TwoSic1 and TwoSic2 for a given year. I then want to replicate this for TwoSic2 as well and sum the two counts up in the end, but that should not be the issue.
AllCEOs | Year | TwoSic1 | TwoSic2 | _CountTwoSic1 |
---|---|---|---|---|
1 | 2019 | 15 | 16 | 1 |
1 | 2019 | 16 | 17 | 1 |
1 | 2019 | 13 | 15 | 2 |
1 | 2019 | 13 | 15 | 2 |
1 | 2018 | 15 | 16 | 1 |
1 | 2018 | 16 | 17 | 1 |
1 | 2018 | 13 | 15 | 1 |
Thanks for the data example, but I don't follow your hesitation. The code you suggest reproduces your desired output for the example given.
* Example generated by -dataex-. For more info, type help dataex
clear
input byte allceos int year byte(twosic1 twosic2 _counttwosic1)
1 2019 15 16 1
1 2019 16 17 1
1 2019 13 15 2
1 2019 13 15 2
1 2018 15 16 1
1 2018 16 17 1
1 2018 13 15 1
end
bysort year twosic* : egen wanted = total(allceos)
list, sep(0)
+--------------------------------------------------------+
| allceos year twosic1 twosic2 _count~1 wanted |
|--------------------------------------------------------|
1. | 1 2018 13 15 1 1 |
2. | 1 2018 15 16 1 1 |
3. | 1 2018 16 17 1 1 |
4. | 1 2019 13 15 2 2 |
5. | 1 2019 13 15 2 2 |
6. | 1 2019 15 16 1 1 |
7. | 1 2019 16 17 1 1 |
+--------------------------------------------------------+
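For anyone wanting to sanity-check the grouped total outside Stata, here is a minimal pandas sketch of the same computation. It assumes the example data from the dataex block above; the variable names mirror the Stata ones, and `groupby(...).transform("sum")` plays the role of bysort-plus-egen total().

```python
import pandas as pd

# Reconstruction of the -dataex- example above.
df = pd.DataFrame({
    "allceos": [1, 1, 1, 1, 1, 1, 1],
    "year":    [2019, 2019, 2019, 2019, 2018, 2018, 2018],
    "twosic1": [15, 16, 13, 13, 15, 16, 13],
    "twosic2": [16, 17, 15, 15, 16, 17, 15],
})

# Equivalent of: bysort year twosic1 twosic2 : egen wanted = total(allceos)
# Every row in the same (year, twosic1, twosic2) group gets the group's
# total of allceos, matching the desired _CountTwoSic1 column.
df["wanted"] = (
    df.groupby(["year", "twosic1", "twosic2"])["allceos"].transform("sum")
)
```

Note that, as in the Stata code, this counts rows within each (year, twosic1, twosic2) combination; it does not search for a single code across both columns, which is why it reproduces the desired output for the example shown.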