Save list of distinct values of a variable in another variable

I have data at the country-year-z level, where z is a categorical variable that can take (say) 10 different values for each country-year. Each combination of country-year-z is unique in the dataset.

I would like to obtain a dataset at the country-year level, with a new (string) variable containing all distinct values of z.

For instance let's say I have the following data:

country     year    z
A           2000    1
A           2001    1
A           2001    2
A           2001    4
A           2002    2
A           2002    5
B           2001    7
B           2001    8
B           2002    4
B           2002    5
B           2002    9
B           2003    3
B           2003    4
B           2005    1

I would like to get the following data:

country     year    z_distinct
A           2000    1
A           2001    1 2 4
A           2002    2 5
B           2001    7 8
B           2002    4 5 9
B           2003    3 4
B           2005    1

Solution

Here's another way to do it, perhaps more direct. If z is already a string variable, both string() calls should be omitted.

clear 
input str1 country year z
A 2000 1
A 2001 1
A 2001 2
A 2001 4
A 2002 2
A 2002 5
B 2001 7
B 2001 8
B 2002 4
B 2002 5
B 2002 9
B 2003 3
B 2003 4
B 2005 1
end 

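* within each country-year, sort on z and start the list with the smallest value
bysort country year (z) : gen values = string(z[1])
* replace works down the observations in order, so values[_n-1] already holds the list built so far
by country year : replace values = values[_n-1] + " " + string(z) if z != z[_n-1] & _n > 1
* the last observation in each group carries the complete list
by country year : keep if _n == _N
drop z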

list , sepby(country) 
     +-------------------------+
     | country   year   values |
     |-------------------------|
  1. |       A   2000        1 |
  2. |       A   2001    1 2 4 |
  3. |       A   2002      2 5 |
     |-------------------------|
  4. |       B   2001      7 8 |
  5. |       B   2002    4 5 9 |
  6. |       B   2003      3 4 |
  7. |       B   2005        1 |
     +-------------------------+
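
For the string case mentioned above, here is a minimal sketch of the same approach with the string() calls dropped. It assumes z is stored as a string variable (declared as str2 here purely for illustration) and uses a shortened copy of the example data.

```stata
clear
* illustrative subset of the data above, with z read in as a string (assumed str2)
input str1 country year str2 z
A 2001 1
A 2001 2
A 2001 4
B 2002 4
B 2002 5
B 2002 9
end

* same logic as before, but z is concatenated directly because it is already a string
bysort country year (z) : gen values = z[1]
by country year : replace values = values[_n-1] + " " + z if z != z[_n-1] & _n > 1
by country year : keep if _n == _N
drop z

list , sepby(country)
```

Note that a string z is sorted alphabetically within each country-year, so "10" would come before "2"; if numeric order matters, it may be simpler to keep z numeric and use string() as in the code above.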