R dplyr reframe warning vs error based on dataset


I have a dataset where I need to find the values immediately above and below a threshold (1.1 in this example). This is what I have done:

library(dplyr)  # mutate(), group_by(), reframe()
library(tidyr)  # pivot_longer()

df1 <- data.frame(da=c(2.9245,2.8585,2.5225,2.0145,1.1715,0.5075,0.2645,0.0915), 
                  d2=c(2.912375,2.703375,2.502375,2.025375,1.110375,0.535375,0.243375,0.072375),
                  Blank...25=c(0.058,0.060,0.059,0.080,0.059,0.066,0.085,0.105), 
                  dilution=c(1:8))

titers_long2 <- df1 |>
  mutate(dilution = dilution) |>
  pivot_longer(-dilution)

titers2 <- titers_long2|>
  group_by(name) |>
  reframe(belowt = value[which(value == max(value[value<=1.1]))],
          abovet = value[which(value == min(value[value >=1.1]))],
          belowdil = dilution[which(value == max(value[value<=1.1]))],
          abovedil = dilution[which(value == min(value[value >=1.1]))]
  )

This works with the caveat that it gives me a warning for the names where the criteria are not met (Blank...25 in this case, where all values are <1.1).

However, the code behaves differently if I use a data frame with a different name for my Blank (in this case Blank...37), as follows:

df2 <- data.frame(da=c(2.9245,2.8585,2.5225,2.0145,1.1715,0.5075,0.2645,0.0915), 
                  d2=c(2.912375,2.703375,2.502375,2.025375,1.110375,0.535375,0.243375,0.072375),
                  Blank...37=c(0.054,0.060,0.057,0.085,0.078,0.067,0.063,0.085), 
                  dilution=c(1:8))

titers_long2 <- df2 |>
  mutate(dilution = dilution) |>
  pivot_longer(-dilution)

titers2 <- titers_long2|>
  group_by(name) |>
  reframe(belowt = value[which(value == max(value[value<=1.1]))],
          abovet = value[which(value == min(value[value >=1.1]))],
          belowdil = dilution[which(value == max(value[value<=1.1]))],
          abovedil = dilution[which(value == min(value[value >=1.1]))]
  )

Running the code with df2 gives me an error:

Error in `reframe()`:
! Can't recycle `abovet = value[which(value == min(value[value >= 1.1]))]`.
ℹ In group 1: `name = "Blank...37"`.
Caused by error:
! `abovet` must be size 2 or 1, not 0.
ℹ An earlier column had size 2.

Why does the code behave differently with df1 and df2 and how can I fix it?


Solution

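  • This works for both data frames. It reuses the already computed min/max values, and taking max()/min() of dilution collapses ties to one row per group. suppressWarnings() hides the min() warning for groups with no value above 1.1, which show up as Inf. Note that the upper bound here is > 1.1, whereas the question used >= 1.1.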

    df1 %>% 
      pivot_longer(-dilution) %>% 
      summarize(bval = max(value[which(value <= 1.1)]), 
                aval = min(value[which(value > 1.1)]), 
                bdil = max(dilution[bval == value]), 
                adil = min(dilution[aval == value]), .by = name) %>% 
      suppressWarnings()
    # A tibble: 3 × 5
      name        bval   aval  bdil  adil
      <chr>      <dbl>  <dbl> <int> <dbl>
    1 da         0.507   1.17     6     5
    2 d2         0.535   1.11     6     5
    3 Blank...25 0.105 Inf        8   Inf
    
    df2 %>% 
      pivot_longer(-dilution) %>% 
      summarize(bval = max(value[which(value <= 1.1)]), 
                aval = min(value[which(value > 1.1)]), 
                bdil = max(dilution[bval == value]), 
                adil = min(dilution[aval == value]), .by = name) %>% 
      suppressWarnings()
    # A tibble: 3 × 5
      name        bval   aval  bdil  adil
      <chr>      <dbl>  <dbl> <int> <dbl>
    1 da         0.507   1.17     6     5
    2 d2         0.535   1.11     6     5
    3 Blank...37 0.085 Inf        8   Inf
    

    Using sort() with first()/last() instead of min()/max() avoids the warnings and returns NA for groups with no match, but it is a bit slower on huge data sets.

    df1 %>% 
      pivot_longer(-dilution) %>% 
      summarize(bval = last(sort(value[which(value <= 1.1)])), 
                aval = first(sort(value[which(value > 1.1)])), 
                bdil = last(sort(dilution[bval == value])), 
                adil = first(sort(dilution[aval == value])), .by = name)
    # A tibble: 3 × 5
      name        bval  aval  bdil  adil
      <chr>      <dbl> <dbl> <int> <int>
    1 da         0.507  1.17     6     5
    2 d2         0.535  1.11     6     5
    3 Blank...25 0.105 NA        8    NA
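
    As for why df1 and df2 behave differently in the original reframe() call: the column name is probably not the cause. Judging from the error message ("An earlier column had size 2"), the likely culprit is a tie. In Blank...37 the largest value below 1.1 (0.085) occurs twice, so belowt has length 2, while abovet has length 0, and dplyr cannot recycle 0 to 2. In Blank...25 the maximum (0.105) occurs once, so belowt has length 1, which can be recycled down to length 0, and the group is silently dropped with only a warning. The base R sketch below checks those lengths; below_idx() and above_idx() are hypothetical helper names introduced for illustration, not part of the original code.

```r
# Blank columns copied from df1 and df2 in the question
blank25 <- c(0.058, 0.060, 0.059, 0.080, 0.059, 0.066, 0.085, 0.105)
blank37 <- c(0.054, 0.060, 0.057, 0.085, 0.078, 0.067, 0.063, 0.085)

# Hypothetical helper: positions of the largest value <= threshold
below_idx <- function(value, threshold = 1.1) {
  which(value == max(value[value <= threshold]))
}

# Hypothetical helper: positions of the smallest value >= threshold.
# min() of an empty vector returns Inf with a warning, so nothing matches.
above_idx <- function(value, threshold = 1.1) {
  suppressWarnings(which(value == min(value[value >= threshold])))
}

length(below_idx(blank25))  # 1: the maximum 0.105 is unique
length(below_idx(blank37))  # 2: the maximum 0.085 appears twice
length(above_idx(blank25))  # 0: no value is >= 1.1
length(above_idx(blank37))  # 0: no value is >= 1.1
```

    If this reading is right, any group with a tied maximum and no value above the threshold would trigger the same error, which is why the summarize() versions above reduce each column to a single value per group.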