I am trying to split a string into tokens on the operators + = == <= >= != || { } when they occur outside double quotes. However, my regex also splits on single occurrences of |, <, > and !, which I do not want. How can I handle this?
String line1 = "sa2dvf=s||a|df&&v<gdsf==ds!gv!=fdgv\"fvdsvg=kjhbhbj==\"";
String regex = "[\\{\\}+={!=}{<=}{>=}{||}](?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";
String[] tokens = line1.split(regex, -1);
for (String val : tokens) {
    System.out.println(val);
}
And its output is:
sa2dvf
s
a
df&&v
gdsf
ds
gv
fdgv"fvdsvg=kjhbhbj=="
But the required output is:
sa2dvf
s
a|df&&v<gdsf
ds!gv
fdgv"fvdsvg=kjhbhbj=="
You can use this lookahead regex for splitting:
String[] arr = str.split("(?:[<>=!]=|\\|\\||[+=\\{}])(?=(?:(?:[^\"]*\"){2})*[^\"]*$)");
RegEx Breakup:
(?:[<>=!]=|\\|\\||[+=\\{}]) : matches one of the operators we want to split on
(?:[^"]*"){2} : finds a pair of quotes
(?:(?:[^"]*"){2})* : finds 0 or more pairs of quotes
[^"]*$ : makes sure we don't have any more quotes after the last matched quote
So (?=...) asserts that we have an even number of quotes ahead, thus matching symbols outside the quoted string only.
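For reference, here is a minimal runnable sketch that applies the same regex to the input from the question. The class name OperatorSplitDemo, the helper tokenize, and the two constants are hypothetical names chosen for illustration. It assumes the quotes in the input are balanced and that there are no escaped quotes inside a quoted section, since the lookahead only counts quote characters.

```java
import java.util.Arrays;
import java.util.List;

public class OperatorSplitDemo {
    // Two-character operators are listed first in the alternation, so "==" or "!="
    // is consumed as one delimiter instead of being split at a single character.
    static final String OPERATORS = "(?:[<>=!]=|\\|\\||[+=\\{}])";

    // Lookahead: from the match to the end of input, quotes must come in pairs
    // (an even count), which means the match itself sits outside any quoted section.
    static final String OUTSIDE_QUOTES = "(?=(?:(?:[^\"]*\"){2})*[^\"]*$)";

    // Hypothetical helper: splits the input on the operators found outside quotes.
    static List<String> tokenize(String input) {
        return Arrays.asList(input.split(OPERATORS + OUTSIDE_QUOTES, -1));
    }

    public static void main(String[] args) {
        String line1 = "sa2dvf=s||a|df&&v<gdsf==ds!gv!=fdgv\"fvdsvg=kjhbhbj==\"";
        for (String token : tokenize(line1)) {
            System.out.println(token);
        }
    }
}
```

The single |, < and ! are left alone because none of the alternatives matches them on their own, and the = and == inside the quoted part are skipped because an odd number of quotes follows them.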