Add a row after each row in an R data frame


I have data that looks like this:

  X           snp_id is_severe encoding_1 encoding_2 encoding_0
1 0 GL000191.1-37698         0          0          1          7
3 2 GL000191.1-37922         1          1          0         12

What I want is to add a new row after every row. The new row keeps the previous row's snp_id, and its is_severe value is flipped: 1 if the previous row has 0, and 0 if the previous row has 1. The goal is for every snp_id to appear twice, once with is_severe = 0 and once with is_severe = 1 (all snp_id values in the data are unique). In the new row, encoding_1 and encoding_2 are both 0, and encoding_0 depends on the new row's is_severe value: it is 8 if is_severe is 0 and 13 if is_severe is 1.

Example of the desired output:

  X           snp_id is_severe encoding_1 encoding_2 encoding_0
1 0 GL000191.1-37698         0          0          1          7
2 1 GL000191.1-37698         1          0          0         13  <- new row
3 2 GL000191.1-37922         1          1          0         12
4 3 GL000191.1-37922         0          0          0          8  <- new row

I saw a similar question here: "How can I add rows to an R data frame every other row?", but I need to do more data manipulation, and unfortunately that solution doesn't solve my problem. Thank you :)


Solution

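  • Here are two options: 1) split and map (split by snp_id and add a row to each piece), or 2) copy and bind (modify a copy of the data and bind it back to the original).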

    library(tidyverse)
    
    dat <- read_table("snp_id        is_severe encoding_1 encoding_2 encoding_0
    GL000191.1-37698         0          0          1          7
    GL000191.1-37922         1          1          0         12")
    
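    # Split into one-row tibbles per snp_id, append the complementary row
    # to each piece, then row-bind the pieces back together.
    dat |>
      group_split(snp_id) |>
      map_dfr(~add_row(.x, 
                       snp_id = .x$snp_id,
                       is_severe = 1 - (.x$is_severe == 1),   # flip 0 <-> 1
                       encoding_1 = 0, 
                       encoding_2 = 0,
                       # .x$is_severe is the ORIGINAL value here, so 1 -> 8 and 0 -> 13
                       encoding_0 = ifelse(.x$is_severe == 1, 8, 13)))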
    #> # A tibble: 4 x 5
    #>   snp_id           is_severe encoding_1 encoding_2 encoding_0
    #>   <chr>                <dbl>      <dbl>      <dbl>      <dbl>
    #> 1 GL000191.1-37698         0          0          1          7
    #> 2 GL000191.1-37698         1          0          0         13
    #> 3 GL000191.1-37922         1          1          0         12
    #> 4 GL000191.1-37922         0          0          0          8
    

    or, with copy and bind (reusing dat from the first example):

    library(tidyverse)
    
    
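    # Stack the original rows on top of a modified copy, then sort by snp_id
    # so that each original row is followed by its new row.
    bind_rows(dat,
              dat |> 
                mutate(is_severe = 1 - (is_severe == 1),   # flip 0 <-> 1
                       across(c(encoding_1, encoding_2), ~.*0),
                       # is_severe is already flipped at this point, so 1 -> 13 and 0 -> 8
                       encoding_0 = ifelse(is_severe == 1, 13, 8))) |>
      arrange(snp_id)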
    #> # A tibble: 4 x 5
    #>   snp_id           is_severe encoding_1 encoding_2 encoding_0
    #>   <chr>                <dbl>      <dbl>      <dbl>      <dbl>
    #> 1 GL000191.1-37698         0          0          1          7
    #> 2 GL000191.1-37698         1          0          0         13
    #> 3 GL000191.1-37922         1          1          0         12
    #> 4 GL000191.1-37922         0          0          0          8
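Both options above drop the X column and order the result by snp_id rather than by the original row order. If that matters, the following is a minimal base R sketch of the same copy-and-bind idea that interleaves rows by position and renumbers X. It assumes X is just a 0-based row counter, as the desired output suggests; the names `extra`, `n`, and `out` are only illustrative.

```r
# Input rebuilt from the question, including the X column
# (assumed here to be a plain 0-based row counter).
dat <- data.frame(
  X          = c(0, 2),
  snp_id     = c("GL000191.1-37698", "GL000191.1-37922"),
  is_severe  = c(0, 1),
  encoding_1 = c(0, 1),
  encoding_2 = c(1, 0),
  encoding_0 = c(7, 12)
)

# Build the complementary rows: flip is_severe, zero the two encodings,
# and set encoding_0 from the flipped value (1 -> 13, 0 -> 8).
extra <- dat
extra$is_severe  <- 1 - dat$is_severe
extra$encoding_1 <- 0
extra$encoding_2 <- 0
extra$encoding_0 <- ifelse(extra$is_severe == 1, 13, 8)

# Interleave by position: original row i is followed by its new row i.
n   <- nrow(dat)
out <- rbind(dat, extra)[order(c(seq_len(n), seq_len(n))), ]

# Renumber X and reset the row names to match the desired output.
out$X <- seq_len(nrow(out)) - 1
rownames(out) <- NULL
out
```

Because the interleaving uses row positions instead of sorting on snp_id, the original order of the SNPs is kept even when the ids are not already sorted.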