java regex readability

Is there a way to write a regex in a String without escaping every character?


I have this line of code to remove some punctuation:

str.replaceAll("[\\-\\!\\?\\.\\,\\;\\:\\\"\\']", "");

I don't know whether all the characters in this regex need to be escaped; I escaped them all just to be safe.

Is there a way to build a regex like this more clearly?


Solution

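  • Inside [...] most characters lose their special meaning and don't need to be escaped. [.] for instance matches only a literal dot, since "any character" wouldn't make sense inside a class anyway!

    The exceptions to the rule are

    • ] since it would close the whole [...] expression prematurely.
    • ^ if it is the first character, since [^abc] matches everything except a, b and c.
    • - unless it's the first/last character, since [a-z] matches all characters from a to z.
    • \ since it is still the escape character inside [...].
    • In Java, also [ (it opens a nested class, as in [a-d[m-p]]) and && (class intersection).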

    Thus, you could write

    str.replaceAll("[-!?.,;:\"']", "")
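
To illustrate the exceptions listed above, here is a minimal sketch of a class that also strips ], ^ and the backslash. The class name CharClassEscapes and the helper strip are made up for this example; only ] and \ are escaped, while - stays first and ^ is kept away from the first position so that both are taken literally.

```java
public class CharClassEscapes {
    // Hypothetical helper (not from the question): removes the original
    // punctuation plus ], ^ and \ from the input.
    static String strip(String s) {
        // Regex as the engine sees it: [-!?.,;:"'\]^\\]
        //   -    is first, so it is a literal hyphen
        //   \]   is an escaped closing bracket
        //   ^    is not first, so it is a literal caret
        //   \\   is a literal backslash
        return s.replaceAll("[-!?.,;:\"'\\]^\\\\]", "");
    }

    public static void main(String[] args) {
        // The Java literal below is the text a-b]c^d\e.f
        System.out.println(strip("a-b]c^d\\e.f")); // prints abcdef
    }
}
```

Note that the backslashes are doubled twice over: once for the Java string literal and once for the regex, which is why a single literal backslash ends up as four.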
    

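    To use an arbitrary string as a literal inside a regular expression, you could also use Pattern.quote, which returns a pattern that matches the string literally (it wraps it in \Q...\E). Note that the quoted string is matched as a whole sequence, so this does not replace a character class that matches any one of several characters.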

    Demo:

    String str = "abc-!?.,;:\"'def";
    System.out.println(str.replaceAll("[-!?.,;:\"']", "")); // prints abcdef
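
For comparison, a small sketch of Pattern.quote in use. The class name QuoteDemo and the helper removeLiteral are assumptions made for this example. It shows that the quoted string is matched as one literal sequence, not as a set of individual characters.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: removes every literal occurrence of `literal` from `s`.
    static String removeLiteral(String s, String literal) {
        // Pattern.quote turns "?!" into a pattern matching exactly the text ?!
        return s.replaceAll(Pattern.quote(literal), "");
    }

    public static void main(String[] args) {
        System.out.println(removeLiteral("a?!b?!c", "?!")); // prints abc
        // The sequence ?! never occurs here, so nothing is removed.
        System.out.println(removeLiteral("a?b!c", "?!"));   // prints a?b!c
    }
}
```

So Pattern.quote fits when a whole user-supplied string has to be matched verbatim; for removing any one of several punctuation characters, the character class above is the simpler tool.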