Assume I have a string like below:
String param = "[\"\\n\",\"\\t\",\"'\",\"\\\"\",\"\\\\\"]";
The output of System.out.println is:
"\n","\t","'","\"","\\"
I would like to remove the double quotes that are not preceded by a backslash. In other words, I would like the System.out.println output to look like this:
\n,\t,',\",\\
So I used this pattern:
System.out.println(param.replaceAll("\\\\{0}\"", ""));
But I got this:
\n,\t,',\,\\
As you can see, the double quote preceded by a backslash is also removed. How can I prevent it from being removed?
Edit: Sorry about the square brackets. You may ignore them because they have nothing to do with this question.
You can use the following regex to match and remove the " characters that are string literal delimiters:
(?s)(?<!\\)((?:\\{2})*)"([^"\\]*(?:\\.[^"\\]*)*)"
See the regex demo.
Details
(?s) - DOTALL modifier (just in case the string literal can span across lines)
(?<!\\) - no \ immediately to the left of the current location
((?:\\{2})*) - Group 1: any 0+ consecutive occurrences of 2 backslashes
" - a double quote (string literal start)
([^"\\]*(?:\\.[^"\\]*)*) - Group 2:
  [^"\\]* - any 0+ chars other than \ and "
  (?:\\.[^"\\]*)* - 0+ sequences of
    \\. - a \ followed with any char
    [^"\\]* - any 0+ chars other than \ and "
" - a closing string literal double quote
See the Java demo:
String param = "[\"\\n\",\"\\t\",\"'\",\"\\\"\",\"\\\\\",\"\\\\\\\"\"]";
System.out.println(param);
// => ["\n","\t","'","\"","\\","\\\""]
String regex = "(?s)(?<!\\\\)((?:\\\\{2})*)\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"";
param = param.replaceAll(regex, "$1$2");
System.out.println(param);
// => [\n,\t,',\",\\,\\\"]
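As for why the attempt in the question removes every quote: \\{0} means "exactly zero backslashes", which matches the empty string, so the whole pattern behaves like a bare ". A plain negative lookbehind, (?<!\\)", is the obvious next thing to try, but it seems to break on the "\\" item, because the closing quote there sits right after an escaped backslash. Here is a minimal sketch comparing the three patterns on the input from the question (brackets dropped). The class and method names are made up for illustration only.

```java
public class QuoteStripDemo {
    // Pattern from the question: \\{0} matches zero backslashes (the empty string),
    // so this is equivalent to removing every double quote.
    static String original(String s) {
        return s.replaceAll("\\\\{0}\"", "");
    }

    // Naive fix: remove a quote only if the previous char is not a backslash.
    // It cannot tell an escaped quote (\") from a quote after an escaped backslash (\\").
    static String naiveLookbehind(String s) {
        return s.replaceAll("(?<!\\\\)\"", "");
    }

    // Regex from this answer: matches a whole string literal and keeps
    // Group 1 (pairs of backslashes before it) and Group 2 (the literal body).
    static String literalAware(String s) {
        String regex = "(?s)(?<!\\\\)((?:\\\\{2})*)\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"";
        return s.replaceAll(regex, "$1$2");
    }

    public static void main(String[] args) {
        // Same items as in the question, without the square brackets:
        // "\n","\t","'","\"","\\"
        String param = "\"\\n\",\"\\t\",\"'\",\"\\\"\",\"\\\\\"";

        System.out.println(original(param));
        // => \n,\t,',\,\\
        System.out.println(naiveLookbehind(param));
        // => \n,\t,',\",\\"   (stray quote left after the escaped backslash)
        System.out.println(literalAware(param));
        // => \n,\t,',\",\\
    }
}
```

So the lookbehind alone is probably fine if the data never contains an escaped backslash right before a closing quote; otherwise matching the whole literal, as above, is the safer route.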