r, dataframe, split, separator, tidyr

Split a data frame column at square brackets


I have a data frame with some model estimates. Depending on the observation, an estimate is either a single value or a value followed by a confidence interval in square brackets. The variable is currently a character column (I guess I will need to convert it at some point).

df <- data.frame(x = c("5", "3", "8 [3 - 5]"))

I would like to split this column (x) into two columns: one for the estimated values (y) and one for the confidence interval, with or without brackets (z).

I have tried tidyr::separate and tidyr::split (I am a big fan of the dplyr family :-)), but I do not get the desired result.

tidyr::separate(col=x,into=c("y","z"),sep="//[")

Do you know what I am doing wrong?


Solution

  • This can be done with extract, which captures the leading digits as y and everything after the optional whitespace as z:

    library(tidyr)
    extract(df, x, into = c("y", "z"), "(\\d+)\\s*(.*)")
    

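    Alternatively, use separate with extra = "merge", which splits only at the first run of whitespace and keeps the rest of the string together in z: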

    separate(df, x, into = c("y", "z"), "\\s+", extra = "merge")
    

    data

    df <- data.frame(x= c("5","3","8 [3 - 5]"))
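
As for what went wrong in the original call, two things stand out as written: no data frame is passed to separate, and "//[" uses forward slashes, so the bracket is never escaped (in an R string a literal [ in a regex is written "\\["). The sketch below is one possible corrected version, not part of the answer above. The sep pattern and the fill and convert arguments are choices made here for illustration, and it assumes the estimates are non-negative whole numbers as in the example data. It also converts y to a number, since the question mentions the column is still character.

```r
library(tidyr)

df <- data.frame(x = c("5", "3", "8 [3 - 5]"))

# Split at the opening bracket (plus any whitespace before it).
res <- separate(
  df, x, into = c("y", "z"),
  sep = "\\s*\\[",   # escaped "[" with optional leading whitespace
  fill = "right",    # rows without an interval get NA in z, no warning
  convert = TRUE     # type-convert the new columns, so y becomes integer
)

# The split consumes "[" but leaves the closing "]", so strip it.
res$z <- sub("\\]$", "", res$z)

print(res)
```

With this variant z comes out without brackets and is NA for rows that have no interval, whereas the extract call above keeps the brackets and returns an empty string for those rows.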