Search code examples
rcsvsortingrowcolumnsorting

Sort rows based on columns?


I have a csv in R which has the following table:

df1

Start Efficiency Start Efficiency End
9955 0 9955 25 5848
474909 0 474909 35 3948
90342 25 84384
94209 55 99393

I want to replace the false zero efficiencies corresponding to the start column with the correct efficiencies so that my final result can look like this:

df2

Start Efficiency End
9955 25 5848
474909 35 3948
90342 25 84384
94209 55 99393
df1 <- data.frame(
  Start = c("9955", "474909", "90342", "94209"),
  Efficiency = c("0", "0", "25", "55"), 
  Start = c("9955", "474909"),
  Efficiency = c("25", "35"), 
  End = c("5848", "3948", "84384", "99393"), stringsAsFactors = F)
df2 <- data.frame(
  Start = c("9955", "474909", "90342", "94209"),
  Efficiency = c("25", "35", "25", "55"),
  End = c("5848", "3948", "84384", "99393"), stringsAsFactors = F)

Solution

  • R doesn't like columns with same name hence the duplicate column names become Start.1 and Efficiency.1. You can replace the 0 value with NA and use coalesce.

    library(dplyr)
    
    df1 %>%
      mutate(across(contains('Efficiency'), na_if, 0)) %>%
      transmute(Start, Efficiency = coalesce(Efficiency, Efficiency.1))
    
    #   Start Efficiency
    #1   9955         25
    #2 474909         35
    #3  90342         25
    #4  94209         55