Search code examples
rdataframereplacerowreplicate

Replicate rows with missing values and replace missing values by vector


I have a dataframe in which a column has some missing values. I would like to replicate the rows with the missing values N times, where N is the length of a vector which contains replacements for the missing values.

I first define a replacement vector, then my starting data.frame, then my desired result and finally my attempt to solve it. Unfortunately that didn't work...

> replace_values <- c('A', 'B', 'C')
> data.frame(value = c(3, 4, NA, NA), result = c(5, 3, 1,2))
  value result
1     3      5
2     4      3
3    NA      1
4    NA      2
> data.frame(value = c(3, 4, replace_values, replace_values), result = c(5, 3, rep(1, 3),rep(2, 3)))
  value result
1     3      5
2     4      3
3     A      1
4     B      1
5     C      1
6     A      2
7     B      2
8     C      2
> t <- data.frame(value = c(3, 4, NA, NA), result = c(5, 3, 1,2))
> mutate(t, value = ifelse(is.na(value), replace_values, value))
  value result
1     3      5
2     4      3
3     C      1
4     A      2

Solution

  • You can try a tidyverse solution

    d %>% 
      mutate(value=ifelse(is.na(value), paste0(replace_values, collapse=","), value)) %>% 
      separate_rows(value, sep=",") %>% 
      select(value, everything())
      value result
    1     3      5
    2     4      3
    3     A      1
    4     B      1
    5     C      1
    6     A      2
    7     B      2
    8     C      2
    

    The idea is to replace the NA's by the ,-collapsed 'replace_values'. Then separate the collpased values and binding them by row using tidyr's separate_rows function. Finally sort the data.frame according your expected output.