I have a confidence interval column of type character:
confidence_interval |
---|
(245.0 - 345.2) |
(434.1 - 432.1) |
(123.5 - 1,120.2) |
I want to create two numeric columns: Upper Interval, which holds the first value in the parentheses, and Lower Interval, which holds the second value:
Upper Interval | Lower Interval |
---|---|
245.0 | 345.2 |
434.1 | 432.1 |
123.5 | 1120.2 |
How can this be done using R?
Thanks
Here is a solution.
ci <- c('(245.0,345.2)', '(434.1,432.1)', '(123.5,901.2)')
values <- strsplit(gsub('\\(|\\)', '', ci), split = ",")
upper <- sapply(values, function(x) as.numeric(x[[1]]))
lower <- sapply(values, function(x) as.numeric(x[[2]]))
upper
#> [1] 245.0 434.1 123.5
lower
#> [1] 345.2 432.1 901.2
I use `gsub` to remove the parentheses, and then `strsplit` to split each string at the `,`. Then I use `sapply` to pull each value out as a numeric vector, because `strsplit` returns a list of character vectors (one element per input string).
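To make the three steps above easier to follow, here is a small sketch that prints the intermediate results. The names `stripped`, `parts` and `first` are only illustrative, and `vapply` is swapped in for `sapply` as a stricter alternative (it checks that every element yields exactly one number); the result should be the same.

```r
ci <- c('(245.0,345.2)', '(434.1,432.1)', '(123.5,901.2)')

# Step 1: strip the opening and closing parentheses
stripped <- gsub('\\(|\\)', '', ci)
stripped
#> [1] "245.0,345.2" "434.1,432.1" "123.5,901.2"

# Step 2: split each string at the comma; the result is a list
# with one character vector per input string
parts <- strsplit(stripped, split = ",")
parts[[1]]
#> [1] "245.0" "345.2"

# Step 3: take the first piece of every element and convert it to numeric
first <- vapply(parts, function(x) as.numeric(x[1]), numeric(1))
first
#> [1] 245.0 434.1 123.5
```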
The question was edited after this answer was posted.
If the separator between the values is ' - ', you should use `values <- strsplit(gsub('\\(|\\)', '', ci), split = " - ")` instead.
The `split` parameter in `strsplit` is the pattern the function uses to cut each string into pieces. Here each string contains the separator once, so it is split into two parts.
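For the edited question, a minimal sketch along the same lines might look like the following. It assumes the data sits in a data frame called `df` (a hypothetical name) and that the comma in `1,120.2` is a thousands separator, which `as.numeric` would otherwise turn into `NA`. The new column names `upper_interval` and `lower_interval` are also my own choice, following the question's convention that the first value is the "upper" one.

```r
# Sample data copied from the question; the data frame name `df` is assumed
df <- data.frame(
  confidence_interval = c("(245.0 - 345.2)", "(434.1 - 432.1)", "(123.5 - 1,120.2)"),
  stringsAsFactors = FALSE
)

# Remove parentheses and thousands separators (commas) in one pass
cleaned <- gsub("[(),]", "", df$confidence_interval)

# fixed = TRUE treats " - " as a literal string rather than a regular expression
parts <- strsplit(cleaned, split = " - ", fixed = TRUE)

# First value goes to the "upper" column, second to the "lower" column,
# matching the layout requested in the question
df$upper_interval <- vapply(parts, function(x) as.numeric(x[1]), numeric(1))
df$lower_interval <- vapply(parts, function(x) as.numeric(x[2]), numeric(1))

df$upper_interval
#> [1] 245.0 434.1 123.5
df$lower_interval
#> [1]  345.2  432.1 1120.2
```

Splitting on `" - "` with the surrounding spaces, rather than on a bare `-`, should also keep negative bounds such as `-1.5` intact.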