I have a data frame df, in which one of the column is as mentioned below. I want to split the V1 column to multiple column as mentioned in the desired output.
df = read.table(text = " parameter_1_day_10
parameter_1_day_3
parameter_2_last_day")
desired output
df2 = read.table(text = " 1 day_10
1 day_3
2 last_day")
You can do this with separate_wider_regex()
from tidyr
. The syntax of the regex is a bit strange. Essentially, you identify the parts of the string that will get turned into variables by naming those parts in the regex in a vector. So, the regex that gets used is the one below:
c("parameter_", c1="[^_]*", "_", c2=".*")
because "parameter_"
isn't named, it will to show up as one of the columns. anything that is not an underscore ([^_]*
) directly after "parameter_"
until the next underscore will go into the variable named c1
. The following underscore will not be saved and then everything after that underscore is saved as c2
.
library(dplyr)
library(tidyr)
df = read.table(text = " parameter_1_day_10
parameter_1_day_3
parameter_2_last_day")
df %>%
separate_wider_regex(V1, c("parameter_", c1="[^_]*", "_", c2=".*"))
#> # A tibble: 3 × 2
#> c1 c2
#> <chr> <chr>
#> 1 1 day_10
#> 2 1 day_3
#> 3 2 last_day
Created on 2024-04-03 with reprex v2.0.2