
How to attribute values to a group according to dataset information


I need a dummy variable that identifies the mothers whose children have a high IQ, although the dataset does not record this directly.

I'm not very familiar with R, so my question might be simple. This is the data used:

data <- data.frame(family=c(1,1,1,2,2,2,3,3,3,3),
                  position=c("mother","son","father",
                         "mother","son","father",
                         "mother","son","son","son"),
              sex=c(0,1,1,0,0,1,0,1,0,0),
              highiq=c(0,0,1,1,1,1,0,0,1,0))

Here family identifies which observations make up a family group, position gives the individual's role in the family, sex gives their sex, and highiq equals 1 when the individual has a high IQ. I've already managed to identify sons with a high IQ with:

library(dplyr)
dat2 <- data %>%
  mutate(high.son = position == "son" & highiq == 1)

But I can't get any further. One possible solution, I imagine, would be to create a dummy variable for mothers (is.mother) and a second dummy (IQ.family) that assigns 1 to every member of a family in which at least one son has a high IQ, and 0 to everyone if no son does. Multiplying is.mother by IQ.family should then leave 1 only for mothers with high-IQ sons.

The problem with this strategy is that I couldn't generate IQ.family, because I can't find a way to assign values to a whole group based on other information in the dataset.


Solution

  • any(highiq[position == 'son'] == 1) is TRUE if highiq == 1 for any son in the group. After grouping by family, assign that value to the new variable where position == 'mother', and 0 otherwise.

    library(dplyr)
    data %>% 
      group_by(family) %>% 
      mutate(mother_highIQ_son = 
               ifelse(position == 'mother', any(highiq[position == 'son'] == 1), 0))
    
    # # A tibble: 10 x 5
    # # Groups:   family [3]
    #    family position   sex highiq mother_highIQ_son
    #     <dbl> <chr>    <dbl>  <dbl>             <dbl>
    #  1      1 mother       0      0                 0
    #  2      1 son          1      0                 0
    #  3      1 father       1      1                 0
    #  4      2 mother       0      1                 1
    #  5      2 son          0      1                 0
    #  6      2 father       1      1                 0
    #  7      3 mother       0      0                 1
    #  8      3 son          1      0                 0
    #  9      3 son          0      1                 0
    # 10      3 son          0      0                 0
    

    The == 1 isn't strictly necessary, but without it any() receives a numeric vector and issues a warning about coercing a double to logical.
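
The two-dummy strategy described in the question (is.mother multiplied by IQ.family) can also be written with the same group_by() and mutate() approach. What follows is only a sketch under the question's own naming: is.mother and IQ.family come from the question text, while result is a name introduced here for illustration. It should yield the same mother_highIQ_son column as the answer above.

```r
library(dplyr)

# Data as given in the question
data <- data.frame(
  family   = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  position = c("mother", "son", "father",
               "mother", "son", "father",
               "mother", "son", "son", "son"),
  sex      = c(0, 1, 1, 0, 0, 1, 0, 1, 0, 0),
  highiq   = c(0, 0, 1, 1, 1, 1, 0, 0, 1, 0)
)

result <- data %>%
  group_by(family) %>%
  mutate(
    # 1 for mothers, 0 for everyone else
    is.mother = as.integer(position == "mother"),
    # 1 for every member of a family with at least one high-IQ son;
    # any() returns a single value per group, which mutate() recycles
    IQ.family = as.integer(any(position == "son" & highiq == 1)),
    # 1 only for mothers in such families
    mother_highIQ_son = is.mother * IQ.family
  ) %>%
  ungroup()

print(result)
```

Keeping the intermediate columns makes each step easy to inspect; if only the final dummy is needed, they can be dropped afterwards with select(-is.mother, -IQ.family).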