Create two numeric columns from One character column in R


The confidence_interval column is of type character:

confidence_interval
(245.0 - 345.2)
(434.1 - 432.1)
(123.5 - 1,120.2)

I want to create two numeric columns: Upper Interval, which holds the first value in the parentheses, and Lower Interval, which holds the second value:

Upper Interval    Lower Interval
245.0             345.2
434.1             432.1
123.5             1120.2

How can this be done using R?

Thanks


Solution

  • Here is a solution.

    ci <- c('(245.0,345.2)', '(434.1,432.1)', '(123.5,901.2)')
    
    values <- strsplit(gsub('\\(|\\)', '', ci), split = ",")
    
    upper <- sapply(values, function(x) as.numeric(x[[1]]))
    lower <- sapply(values, function(x) as.numeric(x[[2]]))
    
    upper
    #> [1] 245.0 434.1 123.5
    lower
    #> [1] 345.2 432.1 901.2
    

    I use gsub to remove the parentheses, and then strsplit to split each string at the comma. Then I use sapply to pull the first and second values out as numeric vectors, because strsplit returns a list with one character vector per input string.
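
    To make the intermediate step concrete, here is a small sketch that inspects what strsplit returns for the same ci vector. It also swaps sapply for vapply, which is my own substitution rather than part of the code above: vapply does the same job but checks that each result is a single numeric value.

    ```r
    ci <- c('(245.0,345.2)', '(434.1,432.1)', '(123.5,901.2)')

    # Strip the parentheses, then split each string at the comma.
    values <- strsplit(gsub('\\(|\\)', '', ci), split = ",")

    # strsplit() returns a list holding one character vector per input string.
    str(values)

    # vapply() (swapped in for sapply()) errors if a result is not one numeric.
    upper <- vapply(values, function(x) as.numeric(x[1]), numeric(1))
    lower <- vapply(values, function(x) as.numeric(x[2]), numeric(1))
    ```

    Either function works here; vapply just fails loudly if an element is not a single number.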

    Update: the OP's question was edited.

    If the separator between the values is ' - ', then use values <- strsplit(gsub('\\(|\\)', '', ci), split = " - ") instead.

    The split argument tells strsplit where to cut each string. It is treated as a regular expression unless you also pass fixed = TRUE.
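
    Putting this together for the data as shown in the question, here is a rough sketch that works on a data frame column. The data frame name df and the new column names upper_interval and lower_interval are assumptions for illustration. Because one value in the question is written as 1,120.2, the sketch also strips commas before converting. Otherwise as.numeric would return NA for that entry.

    ```r
    # Example data; `df` is a hypothetical name, the values come from the question.
    df <- data.frame(
      confidence_interval = c("(245.0 - 345.2)", "(434.1 - 432.1)", "(123.5 - 1,120.2)"),
      stringsAsFactors = FALSE
    )

    # Remove parentheses and thousands-separator commas in one pass.
    cleaned <- gsub("[(),]", "", df$confidence_interval)

    # Split on the literal " - " separator (fixed = TRUE turns off regex matching).
    parts <- strsplit(cleaned, split = " - ", fixed = TRUE)

    # First value goes to the upper column, second to the lower, as the question asks.
    df$upper_interval <- vapply(parts, function(x) as.numeric(x[1]), numeric(1))
    df$lower_interval <- vapply(parts, function(x) as.numeric(x[2]), numeric(1))
    ```

    Splitting on " - " with the surrounding spaces, rather than on "-" alone, should also keep a leading minus sign attached to a negative value, assuming the spacing in the data is consistent.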