I am trying to make model output prettier by applying predefined labels to my variables. I have a vector of variable names (a), a vector of labels (b), and a vector of model terms (c).
I need to match (a) against (c) and replace each occurrence of (a) with the corresponding (b). I found this question, which introduced me to the function gsubfn from the gsubfn package. The function matches and replaces multiple strings. I followed their example, but it did not work properly in my case:
library(gsubfn)
a <- c("ecog.ps", "resid.ds", "rx")
b <- c("ECOG-PS", "Residual Disease", "Treatment")
c <- c("ecog.psII", "rxt2", "ecog.psII:rxt2")
gsubfn("\\S+", setNames(as.list(b), a), c)
[1] "ecog.psII" "rxt2" "ecog.psII:rxt2"
If I use a specific pattern, then it works:
gsubfn("ecog.ps", setNames(as.list(b), a), c)
[1] "ECOG-PSII" "rxt2" "ECOG-PSII:rxt2"
So I guess my problem is the regular expression passed as the pattern argument of gsubfn. I checked this R-pub and Hadley's book on regular expressions, and \S+ seems adequate. I tried other regular expressions without success:
gsubfn("[:graph:]", setNames(as.list(b), a), c)
[1] "ecog.psII" "rxt2" "ecog.psII:rxt2"
gsubfn("[:print:]", setNames(as.list(b), a), c)
[1] "ecog.psII" "rxt2" "ecog.psII:rxt2"
Which pattern should I use in gsubfn to match (a) against (c) and replace (a) with (b)?
The \S+ pattern matches each element of c in full (for example, ecog.psII and ecog.psII:rxt2), and the list has no items with such names, so nothing is replaced. Instead, you can build the pattern dynamically from the a vector and use it to find the matches you need.
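To see why \S+ leaves everything unchanged, it can help to look at what the pattern actually matches. The following is a small base R check (it does not use gsubfn; the name `hits` is only for illustration):

```r
a <- c("ecog.ps", "resid.ds", "rx")
c <- c("ecog.psII", "rxt2", "ecog.psII:rxt2")

# No element contains whitespace, so \S+ returns each element as one match
hits <- unlist(regmatches(c, gregexpr("\\S+", c)))
hits
# [1] "ecog.psII"      "rxt2"           "ecog.psII:rxt2"

# None of those whole-string matches is a name in the lookup list
hits %in% a
# [1] FALSE FALSE FALSE
```

Because no match equals a list name, there is nothing for the lookup to substitute.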
Use
pat <- paste(a, collapse="|")
## Or, if there can be special chars that must be escaped (note . must also be escaped)
pat <- paste(gsub("([][/\\\\^$*+?.()|{}-])", "\\\\\\1", a), collapse="|")
## => ecog\.ps|resid\.ds|rx
and then use
gsubfn(pat, setNames(as.list(b), a), c)
If you do not escape special characters, you may overmatch (since . matches any character), match the wrong strings (if the names contain quantifiers or other regex operators), or get an error (if the names contain characters such as (, ), or an unpaired [).
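For readers who want to try the idea without installing gsubfn, here is a rough base R sketch of the same approach: escape the names, join them with |, and look each match up in a named vector. The helper `escape_regex`, the names `lookup` and `out`, and the longest-first ordering are illustrative assumptions, not part of gsubfn. The ordering is only a precaution for regex engines that try alternatives left to right.

```r
a <- c("ecog.ps", "resid.ds", "rx")
b <- c("ECOG-PS", "Residual Disease", "Treatment")
c <- c("ecog.psII", "rxt2", "ecog.psII:rxt2")

# Hypothetical helper: escape regex metacharacters so names match literally
escape_regex <- function(x) gsub("([][/\\\\^$*+?.()|{}-])", "\\\\\\1", x)

# Longer names first, so a short name cannot pre-empt a longer one
ord <- order(nchar(a), decreasing = TRUE)
pat <- paste(escape_regex(a[ord]), collapse = "|")

# Named vector used as a lookup table: variable name -> label
lookup <- setNames(b, a)

# Find every match, then swap each matched name for its label
out <- c
m <- gregexpr(pat, out)
regmatches(out, m) <- lapply(regmatches(out, m), function(hit) unname(lookup[hit]))
out
# [1] "ECOG-PSII"             "Treatmentt2"           "ECOG-PSII:Treatmentt2"
```

Note that rx is replaced inside rxt2 as well, so the factor level suffix (t2) stays attached to the label.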