
Assessing which person is the one with the next birthday


In Stata I am trying to assess which of the given birthdays is the next one compared with a given date. My data looks like this:

  • All dates are in daily format (%dD_m_Y), e.g. 18mar1926
  • Variable date which is the reference date with which all other dates should be compared
  • Variables birth1, birth2, birth3, birth4, birth5, birth6 contain the birthday of all possible household members.

For example, consider a household with two adults, A and B. A was born on 20 Nov 1977 and B on 30 March 1978. The reference date is 29 Nov 2020. I want to know who has the next birthday. In this example it is person B: person A's birthday fell nine days before the reference date, so the next birthday in this household is B's, on 30 March 2021.

Example data:

date birth1 birth2 birth3 birth4 birth5 birth6
02feb2021 15jan1974 27nov1985
30nov2020 31aug1945 27jun1999 07apr1997
19nov2020 27sep1993 30dec1996
29jan2021 29mar1973
05dec2020 21jan1976 02oct1976 21jan1976 25may1995 15feb1997
25nov2020 25nov1943 29nov1946
02feb2021 28apr1979

Solution

  • EDITED to account for Feb 29

    The edit treats a February 29 birthday as if it were March 1 whenever the year in which the next birthday falls is not a leap year. If that doesn't make sense for your particular use case, it should be easy to alter the code below as you see fit.

    Since you want the next upcoming birthday rather than the closest birthday, you can combine the year of date with the month and day of birth{i} to create a date for each person's next birthday, moving it to the following year if it has already passed. Then you can simply take the earliest value within each household. I reshape long and generate a person and household id in order to do this.

    Make example data

    clear
    set obs 6
    set seed 1996
    generate date = floor((mdy(12,31,2020)-mdy(12,1,2015)+1)*runiform() + mdy(12,1,2015))
    format date %td
    
    forvalues i = 1/6 {
        * household in row n gets n members: birth`i' is filled only when _n >= `i'
        gen birth`i' = floor((mdy(12,31,1996)-mdy(12,1,1980)+1)*runiform() + mdy(12,1,1980)) if _n >= `i'
        format birth`i'  %td
    }
    
    replace birth6 = birth4 in 6 // want a tie
    replace birth2 = date("29feb1996","DMY") in 3 // Feb 29
    

    Find Next Birthday

    gen household_id = _n
    reshape long birth, i(date household_id) j(person)
    drop if mi(birth)
    
    gen person_next_birthday = mdy(month(birth), day(birth), year(date))
    * mdy() returns missing for Feb 29 in a non-leap year: treat it as March 1
    replace person_next_birthday = mdy(3, 1, year(date)) if mi(person_next_birthday)
    * Birthday already passed this year: move it to the following year
    replace person_next_birthday = mdy(month(birth), day(birth), year(date) + 1) if person_next_birthday < date
    * Same Feb 29 rule for the following year (missing is never < date, so test mi())
    replace person_next_birthday = mdy(3, 1, year(date) + 1) if mi(person_next_birthday)
    format person_next_birthday %td
    
    bysort household_id  (person_next_birthday): gen next_bday = person_next_birthday[1]
    format next_bday %td
    drop person_next_birthday
    
    reshape wide birth, i(date household_id next_bday) j(person)
    
    gen next_bday_persons = ""
    * Make a string to present household persons who have next bday
    foreach v of varlist birth* {
        local person = subinstr("`v'","birth","",.)
        local condition = "month(`v') == month(next_bday) & day(`v') == day(next_bday)"
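        * Feb 29 counts as March 1 only when the year of next_bday has no Feb 29
        local condition_feb29 = "month(next_bday) == 3 & day(next_bday) == 1 & month(`v') == 2 & day(`v') == 29 & mi(mdy(2, 29, year(next_bday)))"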
        replace next_bday_persons = next_bday_persons + "|`person'" if `condition' | `condition_feb29'
    }
    replace next_bday_persons = regexr(next_bday_persons,"^\|","")
    order next_bday_persons, after(next_bday)
    

    The last loop is optional: it only builds a string listing the persons who share the next birthday, which illustrates that the approach is robust to ties.
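
As a cross-check, the same logic can be sketched without reshaping by looping over the birth variables and keeping a running minimum. This is only a minimal sketch, not part of the original answer: the variable names sdate, sb1, sb2, nb_tmp and next_bday_person are made up for illustration, it handles two household members, and on ties it reports only the first person. The first row is the example from the question, the second is a row from the example data, and the third is a made-up Feb 29 case.

```stata
clear
input str9 sdate str9 sb1 str9 sb2
"29nov2020" "20nov1977" "30mar1978"
"25nov2020" "25nov1943" "29nov1946"
"01mar2021" "29feb1996" "15mar1990"
end

gen date   = date(sdate, "DMY")
gen birth1 = date(sb1, "DMY")
gen birth2 = date(sb2, "DMY")
format date birth1 birth2 %td

gen next_bday = .
gen next_bday_person = .
forvalues i = 1/2 {
    * birthday in the reference year; mdy() gives missing for Feb 29 in a non-leap year
    gen nb_tmp = mdy(month(birth`i'), day(birth`i'), year(date))
    replace nb_tmp = mdy(3, 1, year(date)) if mi(nb_tmp) & !mi(birth`i')
    * already passed: move to the following year, applying the same Feb 29 rule
    replace nb_tmp = mdy(month(birth`i'), day(birth`i'), year(date) + 1) if nb_tmp < date
    replace nb_tmp = mdy(3, 1, year(date) + 1) if mi(nb_tmp) & !mi(birth`i')
    * keep the earliest candidate; a nonmissing date is always < missing
    replace next_bday_person = `i' if nb_tmp < next_bday
    replace next_bday = nb_tmp if nb_tmp < next_bday
    drop nb_tmp
}
format next_bday %td
list date birth1 birth2 next_bday next_bday_person
```

For the first row this gives 30mar2021 and person 2, matching the example in the question. Note that a birthday falling exactly on the reference date counts as the next birthday (second row), which is also how the reshape-based code behaves.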