I'm working with nested expressions in R and need some help with writing a recursive function to process these expressions starting with the innermost parentheses first.
Given a string like:
expr <- "(((a OR b) OR c) AND d) OR (e AND f)"
I want to perform the following replacements:
For example:
The string "((a OR b) OR c) AND d) OR (e AND f)" should evaluate to:
"min(1, (min(1, min(1, a + b) + c) * d) + (e * f))"
The challenge is that the expressions can be nested, and I need to handle the innermost parentheses first before moving outward. This is because I am writing functions to take user input for objective functions, and I have R code to dynamically write Rcpp code...
I've considered using regular expressions to find and replace the patterns, but handling the nested structures recursively is where I'm stuck. I've tried a recursive function approach the following:
process_expr <- function(string) {
# Match the innermost parentheses
pattern <- "\\(([^()]*)\\)"
brackets <- gregexpr(pattern, string, perl = TRUE)
matches <- regmatches(string, brackets)
# If there are no more matches, return the string
if (length(matches[[1]]) == 0) {
return(string)
}
# Process each match
for (match in matches[[1]]) {
# Apply the replacements
modified <- gsub("\\(([^()]+) OR ([^()]+)\\)", "min(1, \\1 + \\2)", match)
modified <- gsub("\\(([^()]+) AND ([^()]+)\\)", "(\\1 * \\2)", modified)
# Replace the match in the original string
string <- sub(pattern, modified, string, fixed = TRUE)
}
# Recurse to process the next level of nested parentheses
return(process_expr(string))
}
# Example usage
expr <- "((a OR b) OR c) AND d) OR (e AND f)"
result <- process_expr(expr)
print(result)
The code above doesn't seem to correctly handle the nested parentheses, especially when there are multiple levels of nesting. I need the function to:
All the solutions I've looked at work for Python or Perl, but not for R.
Is there another approach that I can use to get what I need? i.e.,
Example data: "(((a OR b) OR c) AND d) OR (e AND f)"
Needs to evaluate to:
min(1, (min(1, min(1, a + b) + c) * d) + (e * f))
EDIT:
Here are a couple of other I/O examples:
input_2 <- "(r1 && r2) || r3 + r4 + (r5 && r6)"
output_2: "min(1, (r1 * r2) + r3) + r4) + (r6 * r7)"
Here are a couple of other I/O examples:
input_3 <- "(r1 && r2) || (r3 && r4) + r5 || (r6 + r7)"
output_3: "min(1, (r1 * r2) + (r3 * r4)) + min(1, r5 + (r6 + r7))"
If we translate the input to valid R code, we can use the parse and to the substitution. For example this helper function does the trick
transform <- function(expr) {
tx <- function(e) {
if (is.call(e)) {
parts <- as.list(e)
if (parts[[1]] == quote(`%OR%`)) {
parts[[1]] <- quote(`min`)
parts[[3]] <- bquote(.(tx(parts[[2]])) + .(tx(parts[[3]])))
parts[[2]] <- 1
as.call(parts)
} else if (parts[[1]]==quote(`%AND%`)) {
parts[[1]] <- quote(`*`)
parts[[2]] <- tx(parts[[2]])
parts[[3]] <- tx(parts[[3]])
as.call(parts)
} else {
parts[-1] <- lapply(parts[-1], tx)
as.call(parts)
}
} else {
e
}
}
fix <- gsub("\\bOR\\b", "%OR%", gsub("\\bAND\\b", "%AND%", expr))
fix <- gsub("\\|\\|", "%OR%", gsub("\\&\\&", "%AND%", fix))
deparse1(tx(str2lang(fix)))
}
Which we can test with
expr <- "(((a OR b) OR c) AND d) OR (e AND f)"
transform(expr)
# [1] "min(1, ((min(1, (min(1, a + b)) + c)) * d) + (e * f))"
input_2 <- "(r1 && r2) || r3 + r4 + (r5 && r6)"
transform(input_2)
# [1] "min(1, (r1 * r2) + r3) + r4 + (r5 * r6)"
input_3 <- "(r1 && r2) || (r3 && r4) + r5 || (r6 + r7)"
transform(input_3)
# [1] "min(1, (r1 * r2) + (r3 * r4)) + min(1, r5 + (r6 + r7))"
We turn OR and AND into %OR%
and %AND%
which can be parsed by R and then walk the abstract syntax tree and do the transformation. The parse does the hard work of building the tree.