Error in transforming blank spaces into 0 after pivot_wider for performing DGEList function in R


The df I'm working on is originally a long-format table like the one below:

VALUE Gene_Symbol Sample_ID
12253 BRCA P1
42356 CAMP P2

Then, to generate the DGEList, I decided to transform it into a wide format, which produced the table below:

P1 P2 P3 P4
null 2423 46456 74564 523424
CAMP 42356
BRCA 12253 453 658665

Because some samples may not express a certain gene, the corresponding cells are left blank when I pivot the table wider. When I view() the df, they show as blank, but when I run summary(), they show as NULL in the console.

Right now, I am trying to use apply() to replace the blanks with 0, but with no luck: all values turned into 0.


Solution

  • The values_fill argument of tidyr::pivot_wider should do the trick:

    tidyr::pivot_wider(df, 
                         names_from = Sample_ID, 
                         values_from = VALUE, 
                         values_fill = 0)
    

    Output:

    #   Gene_Symbol    P1    P2
    #   <chr>       <int> <int>
    # 1 BRCA        12253     0
    # 2 CAMP            0 42356
    

    Data

    df <- read.table(text = "VALUE  Gene_Symbol Sample_ID
    12253   BRCA    P1
    42356   CAMP    P2
    ", header = TRUE)
    

    More generally, you can replace the NA values in a data frame without apply() by using logical indexing, like this:

    df_na <- tidyr::pivot_wider(df, names_from = Sample_ID, values_from = VALUE)
    
    #   Gene_Symbol    P1    P2
    #   <chr>       <int> <int>
    # 1 BRCA        12253    NA
    # 2 CAMP           NA 42356
    
    df_na[is.na(df_na)] <- 0
    
    #   Gene_Symbol    P1    P2
    #   <chr>       <int> <int>
    # 1 BRCA        12253     0
    # 2 CAMP            0 42356
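
Since the goal in the question is a DGEList, here is a minimal sketch of one way the zero-filled wide table might be turned into a count matrix. It reuses the toy data from above. The object names `wide` and `counts` are made up for illustration, and the final call assumes the edgeR package from Bioconductor is installed, so it is left commented out.

```r
library(tidyr)

# Toy long-format data from the question
df <- data.frame(
  VALUE = c(12253L, 42356L),
  Gene_Symbol = c("BRCA", "CAMP"),
  Sample_ID = c("P1", "P2")
)

# Pivot to wide format, filling missing gene/sample combinations with 0
wide <- pivot_wider(df,
                    names_from = Sample_ID,
                    values_from = VALUE,
                    values_fill = 0)

# Build a numeric matrix: genes as row names, samples as columns
# (the first column holds Gene_Symbol, so it is dropped from the matrix)
counts <- as.matrix(wide[, -1])
rownames(counts) <- wide$Gene_Symbol

# Assumption: edgeR is installed; the matrix could then be passed on like this
# dge <- edgeR::DGEList(counts = counts)
```

Keeping the gene symbols as row names rather than as a column means the matrix contains only numbers, which is the shape a count matrix is normally expected to have.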