Split salary range

I am trying to split the first column range into two separate columns as so

I have tried the following code:

testDF$IncomeLowerRange <- strsplit(gsub("[^-]+-", "", testDF$IncomeRange), ";")[1]
testDF$IncomeUpperRange <- strsplit(gsub("[^-]+-", "", testDF$IncomeRange), ";")[2]

Solution

I do not think there is a one-liner for this as you need to extract some of the data, replace fully or partially in other cases.

I suggest using a simple logic for this:

> test <- c("$25,000-49,000","Not displayed", "$100,000+")
> df <- data.frame(test)
> df$col1 <- sub("^\\$?([0-9,]+).*", "\\1", df$test)
> df$col1[df$col1=="Not displayed"] <- "NA"
> df$col2 <- sub("^[^-]+-(.*)", "\\1", df$test)
> df$col2[df$col2=="Not displayed"] <- "NA"
> df$col2[df$col2=="$100,000+"] <- "250,000"
> 
> df
            test    col1    col2
1 $25,000-49,000  25,000  49,000
2  Not displayed      NA      NA
3      $100,000+ 100,000 250,000

where col1 will be your IncomeLowerRange and col2 will be your IncomeUpperRange.

The ^\\$?([0-9,]+).* regex is used to extract the first number from a range, and also 100,000 into the first column (^ matches the start of string, \$? matches 1 or 0 $ symbols, ([0-9,]+) captures digits and commas into Group 1 that is kept, and the rest (also, all after the number - .*) is removed.