
Match and remove all spaces inside double brackets


I want to match all spaces inside every [[]] pair in a string so that I can remove them with a replaceAll method.

Example input: text text [[ ia asd ]] [[asdasd]] dfgd dfaf sddgsd [[sss aaa]]

Expected output: text text [[iaasd]] [[asdasd]] dfgd dfaf sddgsd [[sssaaa]]

I thought of this: \[\[(\s*?)\]\], which should match all spaces between double brackets, but it doesn't match anything.

I also tried several other solutions to similar problems, but none seemed to work.

Any clue what else could be used?


Solution

  • Considering it is Java, you can use

    String result = text.replaceAll("(\\G(?!^)|\\[\\[)((?:(?!]]).)*?)\\s+(?=.*?]])", "$1$2");
    

    Or, another approach is matching all substrings between [[ and ]] and then removing any whitespace inside the matches:

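    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String text = "text text [[ ia asd ]] [[asdasd]] dfgd dfaf sddgsd [[sss aaa]]";
    Pattern p = Pattern.compile("\\[\\[.*?]]"); // each [[...]] span, as short as possible
    Matcher m = p.matcher(text);
    StringBuffer buffer = new StringBuffer();
    while (m.find()) {
        // Strip whitespace from the match; quote it so $ and \ are inserted literally
        m.appendReplacement(buffer, Matcher.quoteReplacement(m.group().replaceAll("\\s+", "")));
    }
    m.appendTail(buffer); // copy the text after the last match
    System.out.println(buffer.toString());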
    

    See the Java demo online.
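If Java 9 or later is available, the match-then-clean approach can probably be shortened with Matcher.replaceAll(Function). This is only a sketch under that assumption: the class name BracketSpaceStripper and the method name strip are made up for illustration, and Matcher.quoteReplacement is added so that any $ or \ inside the brackets is kept literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketSpaceStripper {
    // Matches each [[...]] span, as short as possible (no line breaks inside).
    private static final Pattern BRACKETED = Pattern.compile("\\[\\[.*?]]");

    // Hypothetical helper; needs Java 9+ for Matcher.replaceAll(Function).
    static String strip(String text) {
        return BRACKETED.matcher(text).replaceAll(
            // Remove whitespace inside the match, then quote the result so
            // that $ and \ are not treated as replacement syntax.
            m -> Matcher.quoteReplacement(m.group().replaceAll("\\s+", "")));
    }
}
```

Calling BracketSpaceStripper.strip(text) on the example input should give the expected output from the question.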

    The first regex means:

    • (\G(?!^)|\[\[) - Group 1 ($1): either the end of the preceding successful match (\G, with (?!^) excluding the start of the string) or a literal [[
    • ((?:(?!]]).)*?) - Group 2 ($2): zero or more chars other than line break chars, as few as possible, none of which starts a ]] char sequence
    • \s+ - one or more whitespaces
    • (?=.*?]]) - immediately to the right, there must be any zero or more chars other than line break chars, as few as possible, and then ]].
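To try the first regex on the example from the question, a minimal self-contained sketch could look like this. The class name BracketSpaceRemover and the method name removeSpaces are assumptions made for illustration; the pattern and the "$1$2" replacement are the ones explained above.

```java
import java.util.regex.Pattern;

class BracketSpaceRemover {
    // Same pattern as in the one-liner, compiled once for reuse.
    private static final Pattern SPACE_IN_BRACKETS = Pattern.compile(
        "(\\G(?!^)|\\[\\[)((?:(?!]]).)*?)\\s+(?=.*?]])");

    // Hypothetical helper: drops each whitespace run found between [[ and ]],
    // putting back Group 1 and Group 2 so only the whitespace is removed.
    static String removeSpaces(String text) {
        return SPACE_IN_BRACKETS.matcher(text).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        String text = "text text [[ ia asd ]] [[asdasd]] dfgd dfaf sddgsd [[sss aaa]]";
        System.out.println(removeSpaces(text));
        // prints: text text [[iaasd]] [[asdasd]] dfgd dfaf sddgsd [[sssaaa]]
    }
}
```

Spaces outside the brackets stay untouched because a match can only start at [[ or continue from the previous match, and Group 2 cannot run past a ]].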