I have a csv in R which has the following table:
df1
Start | Efficiency | Start | Efficiency | End |
---|---|---|---|---|
9955 | 0 | 9955 | 25 | 5848 |
474909 | 0 | 474909 | 35 | 3948 |
90342 | 25 | 84384 | ||
94209 | 55 | 99393 |
I want to replace the false zero efficiencies corresponding to the start column with the correct efficiencies so that my final result can look like this:
df2
Start | Efficiency | End |
---|---|---|
9955 | 25 | 5848 |
474909 | 35 | 3948 |
90342 | 25 | 84384 |
94209 | 55 | 99393 |
df1 <- data.frame(
Start = c("9955", "474909", "90342", "94209"),
Efficiency = c("0", "0", "25", "55"),
Start = c("9955", "474909"),
Efficiency = c("25", "35"),
End = c("5848", "3948", "84384", "99393"), stringsAsFactors = F)
df2 <- data.frame(
Start = c("9955", "474909", "90342", "94209"),
Efficiency = c("25", "35", "25", "55"),
End = c("5848", "3948", "84384", "99393"), stringsAsFactors = F)
R doesn't like columns with same name hence the duplicate column names become Start.1
and Efficiency.1
. You can replace the 0 value with NA
and use coalesce
.
library(dplyr)
df1 %>%
mutate(across(contains('Efficiency'), na_if, 0)) %>%
transmute(Start, Efficiency = coalesce(Efficiency, Efficiency.1))
# Start Efficiency
#1 9955 25
#2 474909 35
#3 90342 25
#4 94209 55