Search code examples
rdataframemergesubsetextract

R: extracting certain rows from dataframe


here's an example dataframe:

df1<-data.frame(c("A","B","C","D","F","G"),c(1,4,2,5,3,6), c(50,23,52,15,40,49))
colnames(df1)<-c("individual.ID","brood","weight")

df2<-data.frame(c("B","D","F"),c(1,2,3))
colnames(df2)<-c("mother.ID","brood.mother")

I'm trying to solve an assignment for my coding class and I've been on it for two hours now, and can't seem to solve it. The assignment is to plot the weight of the mothers against the weight of the individuals. The mothers are included in the data of the individuals. The brood in df1 denotes the brood this individual was birthed in, the brood in df2 denotes which brood the mother has given birth to (hope that's not too confusing)

I cannot figure out how to extract the data of the mothers from the dataframe. I think it would be best if I could separate the first column of the df1 into mothers and non-mothers. I suspect that have to use merge() somehow, because that was the main topic of the class. I've tried to extract the data by simple indexing and subsetting using the character string of the mothers IDs, but it doesn't work (and is probably stupid to try)

df1[df2$mother,]

subset(df1, ID==df2$mother, select="weight")

Indexing or subsetting manually is not an option, because the Data I use are several hundred rows long.

Also, we are only allowed to use the R base package, so I'd be happy about any ideas that work with that!

Thanks :-)


Solution

  • This can address you to some clear path. Merge dataframes and create a flag for mother/non-mother:

    #Code
    Merged <- merge(df1,df2,by.x=c('individual.ID'),by.y = c('mother.ID'),all.x = T)
    #Flag
    Merged$Flag <- 'Non-Mother'
    Merged$Flag[!is.na(Merged$brood.mother)]<-'Mother'
    

    Output:

    Merged
      individual.ID brood weight brood.mother       Flag
    1             A     1     50           NA Non-Mother
    2             B     4     23            1     Mother
    3             C     2     52           NA Non-Mother
    4             D     5     15            2     Mother
    5             F     3     40            3     Mother
    6             G     6     49           NA Non-Mother