
Creating a variable with the highest number observed at a family level in Stata


Suppose I have raw data that look like this:

Group  Family  Individual  Highest score
1      4       2           0
1      4       3           1
1      5       1           2
1      6       2           0
1      6       3           2

Here 0 = low, 1 = medium, 2 = high for highest score.

Individual IDs are unique within a family, and family IDs are unique within a group. I want to create a variable that contains the highest score in each family. For example, for family 4 the new variable would be 1 for both individuals 2 and 3, because 1 is the highest score achieved within that family.

I have tried

egen familyhighscore = max(highestscore), by(family) 

and

bysort family (highestscore): gen familyhighscore = family[_N]

and neither worked the way I wanted. How would I create this new "family high score" variable while retaining the same value labels as "highest score"?

I couldn't find an existing post that answers my question yet - apologies if it has been asked before.


Solution

  • If family IDs are unique only within a group (that is, the same family number can appear in more than one group), compute the maximum by group and family together:

    egen familyhighscore = max(highestscore), by(group family)
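
As a minimal sketch, the following rebuilds the sample data from the question and applies that command. The label name `scorelbl` and the second variable `familyhighscore2` are hypothetical names chosen for illustration. It assumes the value labels are 0 = low, 1 = medium, 2 = high, as the question describes. To my knowledge `egen` does not copy value labels to the variable it creates, so the sketch attaches the same label explicitly, which addresses the value-label part of the question.

```stata
clear
input group family individual highestscore
1 4 2 0
1 4 3 1
1 5 1 2
1 6 2 0
1 6 3 2
end

* Value labels as described in the question ("scorelbl" is a hypothetical name)
label define scorelbl 0 "low" 1 "medium" 2 "high"
label values highestscore scorelbl

* Maximum score within each family, identified by group and family together
egen familyhighscore = max(highestscore), by(group family)

* Attach the same value label to the new variable
label values familyhighscore scorelbl

* The bysort attempt from the question, taking the last sorted value of
* highestscore rather than of family
bysort group family (highestscore): gen familyhighscore2 = highestscore[_N]

list, sepby(group family)
```

The second attempt in the question assigned `family[_N]`, which copies the family number rather than the score; `highestscore[_N]` is probably what was intended. The two approaches agree on this sample, but they can differ when scores are missing: Stata sorts missing values last, so `highestscore[_N]` would be missing for a family with any missing score, whereas `egen`'s `max()` ignores missing values.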