The df I'm working with is originally a long-format table like the one below:
VALUE | Gene_Symbol | Sample_ID |
---|---|---|
12253 | BRCA | P1 |
42356 | CAMP | P2 |
To generate the DGEList, I pivoted it into wide format, which produced the table below:
Gene_Symbol | P1 | P2 | P3 | P4 |
---|---|---|---|---|
null | 2423 | 46456 | 74564 | 523424 |
CAMP | 42356 | | | |
BRCA | 12253 | 453 | 658665 | |
Because some samples may not express a certain gene, those cells are left blank when I pivot to wide format. When I view() the df, they show as blank, but when I run summary() they show as NULL in the console.
Right now I am trying to use apply() to replace the blanks with 0, but with no luck: all values turned into 0.
The values_fill argument of tidyr::pivot_wider should do the trick:
tidyr::pivot_wider(df,
                   names_from = Sample_ID,
                   values_from = VALUE,
                   values_fill = 0)
Output:
# Gene_Symbol P1 P2
# <chr> <int> <int>
# 1 BRCA 12253 0
# 2 CAMP 0 42356
Data
df <- read.table(text = "VALUE Gene_Symbol Sample_ID
12253 BRCA P1
42356 CAMP P2
", header = TRUE)
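Since the end goal is a DGEList, the zero-filled wide table still has to become a numeric matrix with the gene symbols as row names. A minimal base R sketch of that step: the `wide` data frame is rebuilt by hand here to mirror the output above, `counts` is just an illustrative name, and the edgeR call is left commented out because it assumes edgeR is installed.

```r
# Wide table as produced with values_fill = 0, rebuilt by hand for illustration
wide <- data.frame(Gene_Symbol = c("BRCA", "CAMP"),
                   P1 = c(12253L, 0L),
                   P2 = c(0L, 42356L))

# Drop the gene column and keep the gene symbols as row names instead
counts <- as.matrix(wide[, -1])
rownames(counts) <- wide$Gene_Symbol

# Assumption: edgeR is installed; DGEList() takes the count matrix
# dge <- edgeR::DGEList(counts = counts)
```

Keeping Gene_Symbol out of the matrix matters, because a matrix holds a single type and one character column would turn every count into a string.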
More generally, you can replace the NA values in a data frame without apply() using something like this:
df_na <- tidyr::pivot_wider(df, names_from = Sample_ID, values_from = VALUE)
# Gene_Symbol P1 P2
# <chr> <int> <int>
# 1 BRCA 12253 NA
# 2 CAMP NA 42356
df_na[is.na(df_na)] <- 0
# Gene_Symbol P1 P2
# <chr> <int> <int>
# 1 BRCA 12253 0
# 2 CAMP 0 42356
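The assignment above replaces NA in every column, Gene_Symbol included. If only the count columns should be touched, one possible variant is to restrict the replacement to numeric columns. This is a base R sketch on a hand-built copy of the example data; `num_cols` is a hypothetical helper name.

```r
# Hand-built copy of the example wide table, with NA for missing pairs
df_na <- data.frame(Gene_Symbol = c("BRCA", "CAMP"),
                    P1 = c(12253L, NA),
                    P2 = c(NA, 42356L))

# Logical vector marking which columns are numeric
num_cols <- vapply(df_na, is.numeric, logical(1))

# Replace NA with 0 column by column, leaving non-numeric columns untouched
df_na[num_cols] <- lapply(df_na[num_cols],
                          function(x) replace(x, is.na(x), 0))
```

Working column by column with lapply() avoids apply(), which first converts the data frame to a matrix; with a character column such as Gene_Symbol present, that matrix is character, so the numeric values are no longer numbers inside the applied function.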