I want to split the complex names in my df1 at the second "_", so that everything before it goes into one column and everything after it into another. How can I do this with tidyr?
library(tidyverse)
df1 <- tibble(complex_names=c("King_Arthur_II", "Queen_Elizabeth_I", "King_Charles_III"),
year=c(970,1920,2022)
)
df1
#> # A tibble: 3 × 2
#> complex_names year
#> <chr> <dbl>
#> 1 King_Arthur_II 970
#> 2 Queen_Elizabeth_I 1920
#> 3 King_Charles_III 2022
df1 |>
separate(complex_names,into = c("name", "number"), sep="the second comma")
#> Error in into("name", "number"): could not find function "into"
Created on 2022-09-27 with reprex v2.0.2
I want my data to look like this:
name number year
King_Arthur II 970
...
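I'm no expert in regex, but this answer shows a regular expression that finds the last underscore in a string, which for these names is also the second one. The lookahead (?=[^_]+$) only lets the underscore match when no further underscore follows it. You can use this regular expression as the sep argument of separate():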
library(tidyverse)
df1 <- tibble(complex_names=c("King_Arthur_II", "Queen_Elizabeth_I", "King_Charles_III"),
year=c(970,1920,2022)
)
df1
#> # A tibble: 3 × 2
#> complex_names year
#> <chr> <dbl>
#> 1 King_Arthur_II 970
#> 2 Queen_Elizabeth_I 1920
#> 3 King_Charles_III 2022
df1 |>
separate(complex_names, into = c("name", "number"), sep = "(_)(?=[^_]+$)")
#> # A tibble: 3 × 3
#> name number year
#> <chr> <chr> <dbl>
#> 1 King_Arthur II 970
#> 2 Queen_Elizabeth I 1920
#> 3 King_Charles III 2022
Created on 2022-09-27 with reprex v2.0.2
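If some names could contain more than two underscores, the pattern above would split at the last one rather than the second. As a rough sketch of an alternative (not part of the original answer), tidyr::extract() with two capture groups splits strictly at the second underscore. The regex here is my own assumption about what "second underscore" should mean: the first group takes the first two underscore-separated pieces, the second group takes whatever remains.

```r
library(tidyr)
library(tibble)

df1 <- tibble(
  complex_names = c("King_Arthur_II", "Queen_Elizabeth_I", "King_Charles_III"),
  year = c(970, 1920, 2022)
)

# Group 1: the first two pieces joined by the first underscore.
# Group 2: everything after the second underscore (may contain more underscores).
result <- df1 |>
  tidyr::extract(
    complex_names,
    into = c("name", "number"),
    regex = "^([^_]+_[^_]+)_(.*)$"
  )

result
```

For the three names in the question both approaches give the same name and number columns. They only differ for a hypothetical value such as "King_Henry_VIII_Tudor", where separate() with the lookahead pattern would split before "Tudor" and extract() would split before "VIII_Tudor". Note that extract() returns NA in the new columns for rows that do not match the pattern at all.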