
Regex to remove whitespace in between quotes but not between words inside the quotes?


I am programming in Java.

I am struggling to transform this:

Text0 Text1 " Text2 Text3 Text4     "   Text5 Text6

into this:

Text0 Text1 "Text2 Text3 Text4" Text5 Text6

I have tried lookaheads and lookbehinds:

(?<=\")\s+(\w*\s*\w*)\s+(?=\")

This manages to match all the text inside the quotes, but when I switch to:

(?<=\")\s+(\W*\S*\W*)\s+(?=\")

I get an error. Not sure why.

My limited knowledge of regex is holding me back. Any help would be appreciated.


Solution

  • It's easier not to use (just) regex.

    Split the string on quotes (-1 to keep any trailing empty parts):

    String[] parts = str.split("\"", -1);
    

    Trim the odd-indexed elements (the parts that were between quotes):

    for (int i = 1; i < parts.length; i += 2) {
      parts[i] = parts[i].trim();
    }
    

    Join the parts again:

    String newStr = String.join("\"", parts);
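
Put together, the three steps above could look like the following minimal sketch. The class name `QuoteTrimmer` and the method name `trimInsideQuotes` are made up for illustration, and the code assumes the quotes in the input are balanced and never escaped.

```java
public class QuoteTrimmer {

    // Hypothetical helper; assumes quotes are balanced and never escaped.
    static String trimInsideQuotes(String str) {
        // Limit -1 keeps trailing empty strings, e.g. when the input ends with a quote.
        String[] parts = str.split("\"", -1);

        // Odd indices hold the text that sat between an opening and a closing quote.
        for (int i = 1; i < parts.length; i += 2) {
            parts[i] = parts[i].trim();
        }

        // Put the quote characters back between the parts.
        return String.join("\"", parts);
    }

    public static void main(String[] args) {
        String input = "Text0 Text1 \" Text2 Text3 Text4     \"   Text5 Text6";
        System.out.println(trimInsideQuotes(input));
    }
}
```

Only the text inside the quotes is trimmed. Whitespace outside the quotes, such as the run of spaces after the closing quote in the sample input, is left as it is and would need a separate step if it should be collapsed too.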