java regex string replaceall

Remove punctuation from string


I have a string and I need to remove these symbols: -- + [ ] { } ( ) \ /

For example:

    String clean = "This \\ is / an example. This -- is + an [(example)].";

    clean = clean.replaceAll("[/[()/]]", "");
    clean = clean.replaceAll("/-/-", "");

    clean = clean.replaceAll("\\/","");
    clean = clean.replaceAll("\\\\", " ");
    clean = clean.replaceAll("\\+", "");

    return clean.replaceAll("[ ]+", " ").trim();

My output should be: This is an example. This is an example.

My code does not remove everything I need. I would also like to know whether there is a shorter way to do this.

--

A few details I should mention: a hyphen (-) should be removed only when two appear together (--), and a slash (/) should be replaced by a space. I'm going to try to adapt your solutions here. Thanks.


Solution

  • You can simply call the String.replaceAll method and specify that those characters must be replaced by the empty String:

    clean = clean.replaceAll("(?:--|[\\[\\]{}()+/\\\\])", "");
    

    But if you need to do this many times, it's worth creating a Pattern object so that the regex does not have to be compiled repeatedly:

    private static final Pattern UNWANTED_SYMBOLS =
            Pattern.compile("(?:--|[\\[\\]{}()+/\\\\])");
    

    Now you can use this to create a Matcher object and use that to do the replacement:

    Matcher unwantedMatcher = UNWANTED_SYMBOLS.matcher(clean);
    clean = unwantedMatcher.replaceAll("");
    

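    This should be more efficient if the replacement runs in a loop more than a few times: String.replaceAll behaves like Pattern.compile(regex).matcher(str).replaceAll(repl), so it compiles the regex on every call, whereas the static Pattern is compiled only once.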
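
Putting the pieces together with the details from the question, a complete version might look roughly like the sketch below. The class name SymbolCleaner, the method clean, and the pattern names SLASHES, REMOVED and WHITESPACE_RUN are made up for illustration. Treating the backslash the same way as the forward slash (replaced by a space) is an assumption based on the question's own code, and the final whitespace collapse is taken from the question's last line, since removing symbols alone leaves double spaces behind.

```java
import java.util.regex.Pattern;

class SymbolCleaner {
    // Slashes become a space (assumption: backslash is treated like "/").
    private static final Pattern SLASHES = Pattern.compile("[/\\\\]");

    // "--" as a pair, or any single one of [ ] { } ( ) +, is deleted.
    // A lone "-" is not matched, so it is kept.
    private static final Pattern REMOVED = Pattern.compile("--|[\\[\\]{}()+]");

    // Runs of whitespace left behind by the replacements above.
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    static String clean(String input) {
        String result = SLASHES.matcher(input).replaceAll(" ");
        result = REMOVED.matcher(result).replaceAll("");
        // Collapse leftover whitespace and trim the ends.
        return WHITESPACE_RUN.matcher(result).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String input = "This \\ is / an example. This -- is + an [(example)].";
        // prints: This is an example. This is an example.
        System.out.println(clean(input));
    }
}
```

Splitting the work into two patterns keeps the "replace with a space" and "delete" cases separate, so words joined by a slash (such as a/b) do not get glued together.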