Set a conditional divide function with NA values present

I have a snippet of a larger data set that I am trying to convert to the same unit of measurement. I can convert the columns "unit1" and "legal1" to "ppm" units just fine, but I run into trouble when trying to convert "unit2" and "legal2" to "ppm" units.

When the unit is "ppb" in a "unit" column, I need the values in the corresponding "legal" column to be divided by 1000.

Here is my progress so far:

contaminants <- c("Barium", "Magnanese", "Nitrate", "Nitrate & nitrite")
unit1 <- c("ppb", "ppb", "ppm", "ppm")
legal1 <- c(2000, 50, 10, 10)
legal2 <- c(2, 999999, 10, NA)
unit2 <- c("ppm", "ppb", "ppm", NA)

testdf = data.frame(contaminants, unit1, legal1, unit2, legal2)

I can successfully convert the "unit1" and "legal1" to "ppm" units with

testdf$legal1[testdf$unit1 == "ppb"] <- (testdf$legal1)/1000
testdf$unit1[testdf$unit1 == "ppb"] <- "ppm"

When I try to run the same code for "unit2" and "legal2"

testdf$legal2[testdf$unit2 == "ppb"] <- (testdf$legal2)/1000

I get the following error: "NAs are not allowed in subscripted assignments"

In this case, only the row with 'Magnanese' would need to be divided by 1000, but that's not happening.


  • Here is another tidyverse solution. The issue is caused by the NA, which we can address with !is.na(.) within across(). The challenge is to divide only where the matching unit column is "ppb":

    library(dplyr)
    library(stringr)

    testdf %>%
      mutate(across(starts_with("legal"),
                    ~ if_else(get(str_replace(cur_column(), "legal", "unit")) == "ppb" & !is.na(.), . / 1000, .),
                    .names = "{.col}"),
             across(starts_with("unit"), ~ifelse(. == "ppb", "ppm", .)))

    OR in base R with a custom function:

    # custom function: divide by 1000 only where the unit is "ppb" and the value is not NA
    transform_fun <- function(df, legal_col, unit_col) {
      ifelse(df[[unit_col]] == "ppb" & !is.na(df[[legal_col]]), df[[legal_col]] / 1000, df[[legal_col]])
    }

    # apply the function, then relabel the units
    testdf$legal1 <- transform_fun(testdf, "legal1", "unit1")
    testdf$unit1 <- ifelse(testdf$unit1 == "ppb", "ppm", testdf$unit1)
    testdf$legal2 <- transform_fun(testdf, "legal2", "unit2")
    testdf$unit2 <- ifelse(testdf$unit2 == "ppb", "ppm", testdf$unit2)

    testdf
           contaminants unit1 legal1 unit2  legal2
    1            Barium   ppm   2.00   ppm   2.000
    2         Magnanese   ppm   0.05   ppm 999.999
    3           Nitrate   ppm  10.00   ppm  10.000
    4 Nitrate & nitrite   ppm  10.00  <NA>      NA
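
For comparison, here is a minimal base R sketch that stays close to the code in the question. It is an alternative, not the approach used above. The error comes from the logical index: testdf$unit2 == "ppb" evaluates to NA in the last row, and R refuses an NA subscript when the replacement has more than one element. Wrapping the comparison in which() keeps only the positions that are TRUE. The name ppb_rows is made up for this example.

```r
# Rebuild the example data from the question
testdf <- data.frame(
  contaminants = c("Barium", "Magnanese", "Nitrate", "Nitrate & nitrite"),
  unit1  = c("ppb", "ppb", "ppm", "ppm"),
  legal1 = c(2000, 50, 10, 10),
  unit2  = c("ppm", "ppb", "ppm", NA),
  legal2 = c(2, 999999, 10, NA),
  stringsAsFactors = FALSE
)

# The comparison is NA wherever unit2 is NA; which() drops those
# positions and returns only the row numbers where the test is TRUE.
# ppb_rows is a hypothetical helper name, not part of the original code.
ppb_rows <- which(testdf$unit2 == "ppb")

# Index both sides with the same rows so the lengths match
testdf$legal2[ppb_rows] <- testdf$legal2[ppb_rows] / 1000
testdf$unit2[ppb_rows]  <- "ppm"
```

Indexing the right-hand side with the same rows also avoids relying on recycling: the "legal1" line in the question divides the whole column and then assigns only its first values, which gives the right answer there only because the "ppb" rows happen to come first.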