Find youngest offspring using min() in R


I have a pedigree dataset, and for some calculations and estimates I need to find the birth year of each individual's youngest offspring. I tried min(), which I think should work together with match() to match parent IDs to individual IDs, but this only returns NA. Any ideas how I might solve this?

id <- 1:30
momid <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 1,2,1,2,6,8,6,10,11,13,23,19,16,13,16,20,19,16,19,20,23)
dadid <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 3,4,5,5,7,4,9,7,7,14,24,7,15,18,18,17,21,14,18,21,17)
birthyear <- c(1975, 1975, 1976, 1977, 1977, 1977, 1977, 1978, 1978, 1980, 1981, 1982, 1982, 1984, 1984, 1985, 1985, 1979, 1988, 1989, 1990, 1990, 1991, 1992, 1993, 1993, 1993, 1995, 1995, 1996)
df <- data.frame(id, momid, dadid, birthyear)

min(df$birthyear[match(df$id, df$momid)])
[1] NA
with(df, min(birthyear[match(momid, id)]))
[1] NA

Solution

  • This is @GKi's answer, not my own. GKi posted it as a comment, so I am reposting it as an answer to close the question.

    df$firstoffspring <- sapply(df$id, function(i) min(df$birthyear[df$momid == i | df$dadid == i], na.rm = TRUE))
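
Some background on why the attempts in the question return NA: match(df$id, df$momid) gives, for each id, the position of the first row whose momid equals it, and NA for anyone who never appears as a mother. Indexing birthyear with those NAs yields NAs, and min() without na.rm = TRUE then returns NA. Even with na.rm = TRUE it would return one overall minimum rather than one value per individual. The sapply() line above fixes that, but for individuals with no offspring min() is called on an empty vector, which returns Inf with a warning. Below is a minimal sketch of one way to get NA in that case instead. The helper name earliest_offspring_year is made up for illustration and is not part of GKi's answer.

```r
# Rebuild the data from the question so the example is self-contained.
id <- 1:30
momid <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 1,2,1,2,6,8,6,10,11,13,23,19,16,13,16,20,19,16,19,20,23)
dadid <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 3,4,5,5,7,4,9,7,7,14,24,7,15,18,18,17,21,14,18,21,17)
birthyear <- c(1975, 1975, 1976, 1977, 1977, 1977, 1977, 1978, 1978, 1980, 1981, 1982, 1982, 1984, 1984, 1985, 1985, 1979, 1988, 1989, 1990, 1990, 1991, 1992, 1993, 1993, 1993, 1995, 1995, 1996)
df <- data.frame(id, momid, dadid, birthyear)

# Hypothetical helper (name chosen for this example only).
# Returns the earliest birth year among the offspring of individual i,
# or NA if i has no offspring in the data.
earliest_offspring_year <- function(i, data) {
  # %in% returns FALSE (never NA) when the parent id is missing,
  # so founders with NA parents are simply not selected.
  is_child <- (data$momid %in% i) | (data$dadid %in% i)
  years <- data$birthyear[is_child]
  if (length(years) == 0) NA_real_ else min(years)
}

# vapply() enforces that each result is a single numeric value.
df$firstoffspring <- vapply(df$id, earliest_offspring_year, numeric(1), data = df)

head(df, 10)
```

Note that min() picks the earliest birth year, that is, the first-born offspring. If "youngest" is meant as the most recently born offspring, replacing min() with max() in the helper should give that instead.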