My dataset looks like this:
Country | year | poverty rate | sales |
---|---|---|---|
Austria | 1950 | 0.54 | 142 |
Austria | 1951 | 0.32 | 12441 |
Austria | 1952 | 0.32 | 12441 |
Bangladesh | 1950 | 0.11 | 142123123 |
Bangladesh | 1951 | 0.52 | 1234 |
Bangladesh | 1952 | 0.32 | 12441 |
Sri Lanka | 1950 | 0.95 | 4215 |
Sri Lanka | 1951 | 0.21 | 142421 |
Sri Lanka | 1952 | 0.32 | 12441 |
I want to do `tsset` so that I can (for example) create a new variable for change in sales per year for each country. When I try to do `tsset country year`, I see "repeated time values within panel". How can I create a new variable that is change in sales per year for each country and year? I have more variables, so I would want to be able to specify the variable.
`country` looks like a string variable from here, but if it were, then `tsset country year` would fail for that reason. So, suppose `country` is a numeric variable with value labels. Then it is essential to follow up the report of repeated observations with, say,
duplicates list country year
duplicates tag country year, gen(tag)
edit if tag
Then the next step depends on what you see, for example:
- The duplicates are just junk with missing values on one of those variables. `drop` the junk.
- Accidental duplicate observations. `drop` the duplicates.
- Something more serious.
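Once the duplicates are resolved, the rest might look like the sketch below. It assumes the simplest case, where the repeats are exact copies on every variable, and it assumes the poverty variable is named `poverty_rate` (a hypothetical name, since a Stata variable name cannot contain a space). If `country` turned out to be a string after all, it would first need something like `encode country, generate(country_id)`, with `country_id` then used in place of `country`.

```stata
* Assumption: the repeated observations are exact copies on all variables,
* so it is safe to keep just one of each.
duplicates drop

* Check that country and year now identify each observation uniquely;
* isid stops with an error if they do not.
isid country year

* Declare the panel structure (country must be numeric here).
tsset country year

* Change from the previous year within each country, for each variable listed.
* poverty_rate is an assumed variable name; edit the list to suit.
foreach v of varlist sales poverty_rate {
    generate d_`v' = D.`v'
}
```

The `D.` operator respects the panel structure, so the first year for each country (and any year following a gap) gets a missing value rather than a difference taken across countries.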
See also FAQ https://www.stata.com/support/faqs/data-management/repeated-time-values/