Combining regmatches and replacement for specific characters


I'm trying to modify values that match a specific pattern (digits followed by "BT") by prefixing them with a minus sign, but my code fails. This is my code:

df <- data.frame(
  exposure = c("123BT", "113BB", "116BB", "117BT")
)

df %>%
  mutate(
    exposure2 = case_when(exposure == regmatches("d+\\BT") ~ paste0("-", exposure),
                     TRUE ~ exposure)
  )

The error is:

Error: Problem with `mutate()` column `exposure2`.
i `exposure2 = case_when(...)`.
x argument "m" is missing, with no default
Run `rlang::last_error()` to see where the error occurred.

My desired result is:

df <- data.frame(
  exposure = c("123BT", "113BB", "116BB", "117BT"),
  exposure2 = c(-123, 113, 116, -117)
)

Solution

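  • Your code fails because regmatches() needs both the input vector and the match data (the m argument, e.g. from regexpr()), and it was given only a pattern. I recommend using the stringr package instead: detect "BT" with str_detect() and extract the digits with str_extract() and the regex (\\d)+: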

    library(stringr)
    library(dplyr)
    
    df %>%
      mutate(
        exposure2 = case_when(str_detect(exposure,"BT") ~ paste0("-", str_extract(exposure, "(\\d)+")),
                              TRUE ~ str_extract(exposure, "(\\d)+"))
      )
    

    Output:

      exposure exposure2
    1    123BT      -123
    2    113BB       113
    3    116BB       116
    4    117BT      -117
    
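Note that exposure2 above is a character column, while the target in the question is numeric. A minimal sketch of one way to get numbers, assuming every value starts with digits and that "BT" only ever appears as the suffix; the intermediate column name value is just an illustration:

```r
library(dplyr)
library(stringr)

df <- data.frame(
  exposure = c("123BT", "113BB", "116BB", "117BT")
)

df <- df %>%
  mutate(
    # Extract the first run of digits and convert it to a number
    value = as.numeric(str_extract(exposure, "\\d+")),
    # Negate the number when the string ends in "BT" (assumed suffix rule)
    exposure2 = if_else(str_detect(exposure, "BT$"), -value, value)
  ) %>%
  select(-value)  # drop the helper column
```

Here exposure2 is a double vector, so it can be used directly in arithmetic.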

    If you still prefer regmatches(), you can get the same result by passing it the match data from regexpr():

    df %>%
      mutate(
        exposure2 = case_when(exposure %in% regmatches(exposure, regexpr("\\d+BT", exposure)) ~ paste0("-", regmatches(exposure, regexpr("\\d+", exposure))),
                              TRUE ~ regmatches(exposure, regexpr("\\d+", exposure)))
      )
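One caveat with the regmatches() version: regmatches(x, regexpr(...)) returns only the elements that matched, so it lines up with the rows here only because every value contains digits. A possible base R alternative, sketched under the assumption that each value is digits followed by letters, uses grepl() and sub(), which always return one result per input element. The names digits and is_bt are illustrative only:

```r
df <- data.frame(
  exposure = c("123BT", "113BB", "116BB", "117BT"),
  stringsAsFactors = FALSE
)

# Leading digits of every element; sub() keeps the vector length unchanged
digits <- sub("^(\\d+).*$", "\\1", df$exposure)

# TRUE where the digits are followed by "BT"
is_bt <- grepl("^\\d+BT$", df$exposure)

# Prefix a minus sign only for the "BT" rows
df$exposure2 <- ifelse(is_bt, paste0("-", digits), digits)
```

As in the answer above, exposure2 is a character column here; wrap it in as.numeric() if numbers are needed.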