Is there any way to replace values of df in R using sum of rows?


I have a problem that looks easy to solve, but I'm stuck. I have a data frame whose columns are significant pathways retrieved from GSEA and whose rows are Entrez gene IDs. A cell is 1 if the gene is present in the pathway and 0 if it is not. This is my data frame:

                         Path_A      Path_B       Path_C
Gene_1                   0           1            0
Gene_2                   1           1            0
Gene_3                   0           0            1
Gene_4                   1           1            1

I want to sum each row (gene) to count how many pathways the gene appears in, and then replace every 1 in that row with the count, to get something like this:

                         Path_A      Path_B       Path_C
Gene_1                   0           1            0
Gene_2                   2           2            0
Gene_3                   0           0            1
Gene_4                   3           3            3

So far I have tried my_df <- mutate(my_df, sum = rowSums(my_df)) to create a new column sum, intending to then recode each 1 in the pathway columns with that row's sum value, but I could not get the recoding step to work.

Thanks in advance


Solution

  • You could use dplyr, although the base R solution akrun posted is more reasonable:

    library(dplyr)
    
    df1 %>% 
      mutate(across(Path_A:Path_C, ~ .x * rowSums(across(Path_A:Path_C))))
    

    returns

           Path_A Path_B Path_C
    Gene_1      0      1      0
    Gene_2      2      2      0
    Gene_3      0      0      1
    Gene_4      3      3      3
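
The base R answer mentioned above is not reproduced on this page. As a minimal sketch of one base R way to get the same table (not necessarily the approach akrun posted), you can multiply the data frame by its own row sums. The data are copied from the question, df1 follows the naming in the dplyr answer, and n_paths and result are names made up here for illustration:

```r
# Rebuild the example data from the question.
df1 <- data.frame(
  Path_A = c(0, 1, 0, 1),
  Path_B = c(1, 1, 0, 1),
  Path_C = c(0, 0, 1, 1),
  row.names = paste0("Gene_", 1:4)
)

# Number of pathways each gene appears in (hypothetical helper name).
n_paths <- rowSums(df1)

# A vector of length nrow(df1) is recycled down each column, so every
# value in row i is multiplied by n_paths[i]: 1 becomes the row sum, 0 stays 0.
result <- df1 * n_paths
print(result)
```

This relies on every pathway column being numeric 0/1; non-numeric columns would need to be excluded before calling rowSums().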