I am trying to delete every [.!?] character that appears inside quotes in a text. To do so, I first want to match every quote that contains [.!?] with a regex, and then delete those characters.
My regex doesn't work, probably because it is greedy. It matches from the first "«" (character at index 569) to the last character of the text, which is another "»" (character at index 2730).
My regex was:
    Pattern full = Pattern.compile("«.*[.!?].*?»");
    Matcher mFull = full.matcher(result);
    while (mFull.find()) {
        System.out.println(mFull.start() + " " + mFull.end());
    }
So I got:
569 2731
I also have the same greediness problem when matching sentences (beginning with any [A-Z] and ending with any [.!?]).
You may use
s = s.replaceAll("(\\G(?!^)|«)([^«».!?]*)[.!?](?=[^«»]*»)", "$1$2");
See the regex demo
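To try the replacement locally, here is a minimal sketch. The pattern is the one given above; the class name, the `stripInsideQuotes` wrapper and the sample sentence are made up for illustration.

```java
class QuotePunctuationDemo {
    // Hypothetical wrapper around the replaceAll call from the answer.
    static String stripInsideQuotes(String s) {
        // Removes each . ! ? that sits between « and », one character per match.
        return s.replaceAll("(\\G(?!^)|«)([^«».!?]*)[.!?](?=[^«»]*»)", "$1$2");
    }

    public static void main(String[] args) {
        // Illustrative input: punctuation outside the quote must survive.
        String sample = "Il dit. «Bonjour! Tu vas bien?» Oui.";
        System.out.println(stripInsideQuotes(sample));
        // prints: Il dit. «Bonjour Tu vas bien» Oui.
    }
}
```

The periods after "dit" and "Oui" are kept because a match can only start at a « or directly where the previous match ended, and the lookahead requires a closing » before any other guillemet.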
Details
(\G(?!^)|«) - Group 1 (whose value is referred to with $1 from the replacement pattern): either the end of the previous match or «
([^«».!?]*) - Group 2 ($2): any 0+ chars other than «, », !, . and ?
[.!?] - any of the three symbols
(?=[^«»]*») - there must be a » after 0 or more chars other than « and » immediately to the right of the current location.
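As for why the pattern in the question ran from the first « to the last »: the leading .* is greedy and can cross any number of » characters before backtracking to the last [.!?]. If the goal is only to locate the quotes that contain punctuation, one possible alternative is to replace both dots with the negated class [^«»], so a match cannot leave its quote. The sketch below compares the two; the class name, the `findAll` helper and the sample text are hypothetical, and it assumes quotes are not nested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteMatchDemo {
    // Hypothetical helper: collects every match of regex in text.
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "«A.» mid «B» end «C!»";

        // Pattern from the question: the greedy .* swallows everything
        // up to the last punctuation mark, so there is one long match.
        System.out.println(findAll("«.*[.!?].*?»", text));
        // prints: [«A.» mid «B» end «C!»]

        // Negated class: each match stays inside a single quote,
        // and quotes without . ! ? are skipped.
        System.out.println(findAll("«[^«»]*[.!?][^«»]*»", text));
        // prints: [«A.», «C!»]
    }
}
```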