Given strings like the ones below, I want to remove any leading and trailing punctuation from each word using a regular expression:
String a = "!?Don't.;, .:delete !the@ $actual string%";
String b = "Hyphenated-words, too!";
I know that the regex [\P{Alnum}] will target all non-alphanumeric characters, but how do I target ONLY the leading and trailing punctuation so I get...
a = "Don't delete the actual string";
b = "Hyphenated-words too";
... instead of:
a = "Dont delete the actual string";
b = "Hyphenated words too";
I just need the regular expression; not the actual code to remove the punctuation.
You want to match punctuation that is adjacent to a) a whitespace character or b) the beginning or end of the string. That means either:
your pattern preceded by (?<=^|\s) (a positive lookbehind), or
your pattern followed by (?=\s|$) (a positive lookahead).
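As a rough sketch of the positive-lookaround variant described above, the two pieces can be combined with an alternation. The class name `PositiveLookaroundDemo` and the helper `strip` are made up for illustration, and `\p{Punct}+` is assumed as "your pattern":

```java
public class PositiveLookaroundDemo {
    // A run of punctuation that is preceded by start-of-input or whitespace,
    // OR a run of punctuation that is followed by whitespace or end-of-input.
    static final String REGEX =
            "(?:(?<=^|\\s)\\p{Punct}+)|(?:\\p{Punct}+(?=\\s|$))";

    // Hypothetical helper: removes every match of REGEX from the input.
    static String strip(String input) {
        return input.replaceAll(REGEX, "");
    }

    public static void main(String[] args) {
        System.out.println(strip("!?Don't.;, .:delete !the@ $actual string%"));
        System.out.println(strip("Hyphenated-words, too!"));
    }
}
```

Punctuation inside a word, such as the apostrophe in "Don't" or the hyphen in "Hyphenated-words", has a letter on both sides, so neither alternative matches it.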
To shorten the pattern, we can reword this a little: the punctuation block must either a) not be preceded by a non-whitespace character or b) not be followed by a non-whitespace character. That means either:
your pattern preceded by (?<!\S) (a negative lookbehind), or
your pattern followed by (?!\S) (a negative lookahead).
As a final note, you should use \p{Punct} instead of [\P{Alnum}] to match punctuation. See the comment by sln for details.
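One difference between the two classes that is easy to check (not necessarily the point made in the comment referred to above) is that \P{Alnum} also matches whitespace, while \p{Punct} in Java's default mode covers only ASCII punctuation. A small sketch, with the class and method names invented for illustration:

```java
public class PunctVsNonAlnumDemo {
    // Removes every character that is not an ASCII letter or digit,
    // which includes spaces.
    static String removeNonAlnum(String s) {
        return s.replaceAll("\\P{Alnum}", "");
    }

    // Removes only ASCII punctuation and leaves whitespace alone.
    static String removePunct(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    public static void main(String[] args) {
        String sample = "a b, c!";
        System.out.println(removeNonAlnum(sample)); // spaces are removed too
        System.out.println(removePunct(sample));    // spaces are kept
    }
}
```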
Here is an example usage:
String a = "!?Don't.;, .:delete !the@ $actual string%";
String b = "Hyphenated-words, too!";
String regex = "(?:(?<!\\S)\\p{Punct}+)|(?:\\p{Punct}+(?!\\S))";
System.out.println(a.replaceAll(regex, ""));
System.out.println(b.replaceAll(regex, ""));
Output:
Don't delete the actual string
Hyphenated-words too