Search code examples
rdataframetidyverse

Multiply values of dataframes in R


I have two dataframes and I want to apply formula on values of df1 (column 6 to second last) with the values of column two in df2 and save them as new columns in df1.

Data:

df1 <- read.table(text = "
chr start   end strand  ref.bam bam1    bam2    bam3    bam4    bam5    bam6    bam7    bam8    length
chr1    3531569 3531966 -   128 2   2   1   4   8   36  21  1   397
chr1    3670538 3672624 -   1322    251 50  170 165 294 259 665 86  2086
chr1    4491645 4493854 -   1012    220 46  179 167 275 332 414 77  2209
chr1    4496542 4497750 -   155 115 41  100 67  114 69  42  63  1208
chr1    4571267 4572265 -   428 64  32  77  44  76  130 179 27  998
chr1    4688213 4688719 -   76  39  10  20  20  14  23  25  17  506", header = TRUE)

df2 <- read.table(text = "bam_file    total_dedup_reads
bam1    27220427
bam2    9347059
bam3    19711791
bam4    14089305
bam5    14677615
bam6    8762480
bam7    3510947
bam8    35326515", header = TRUE)

I tired this but I'm not getting the desired results.

x <-df1 %>% mutate(across(6:(ncol(df1)-1), 
                   ~(.x * 10^9)/(df2$total_dedup_reads[which(names(df1[ , c(6:(ncol(df1)-1))]) == cur_column())] * length),
                   .names = "pro_{.col}"))

x <-df1 %>% mutate(across(6:(ncol(df1)-1), 
                  ~(.x * 10^9)/(df2$total_dedup_reads[which(names(df1[ , c(6:(ncol(df1)-1))]) == cur_column())] * length),
                   .names = "pro_{.col}"))

Solution

  • Reshape wide-to-long, then merge, do the calculations, and reshape it back long-to-wide:

    df1 %>% 
    pivot_longer(cols = starts_with("bam"), names_to = "bam_file", values_to = "x") %>% 
      left_join(df2, by = "bam_file") %>% 
      mutate(x = (x * 10^9) / total_dedup_reads / length) %>% 
      select(-total_dedup_reads) %>% 
      pivot_wider(names_from = bam_file, values_from = x)
      
    ## A tibble: 6 x 14
    #   chr     start     end strand ref.bam length  bam1  bam2  bam3  bam4  bam5  bam6  bam7   bam8
    #   <chr>   <int>   <int> <chr>    <int>  <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
    # 1 chr1  3531569 3531966 -          128    397 0.185 0.539 0.128 0.715  1.37 10.3  15.1  0.0713
    # 2 chr1  3670538 3672624 -         1322   2086 4.42  2.56  4.13  5.61   9.60 14.2  90.8  1.17  
    # 3 chr1  4491645 4493854 -         1012   2209 3.66  2.23  4.11  5.37   8.48 17.2  53.4  0.987 
    # 4 chr1  4496542 4497750 -          155   1208 3.50  3.63  4.20  3.94   6.43  6.52  9.90 1.48  
    # 5 chr1  4571267 4572265 -          428    998 2.36  3.43  3.91  3.13   5.19 14.9  51.1  0.766 
    # 6 chr1  4688213 4688719 -           76    506 2.83  2.11  2.01  2.81   1.89  5.19 14.1  0.951