I'd like to normalize any extended ASCII characters, but exclude umlauts.
If I wanted to strip the umlauts as well, I would use:
Normalizer.normalize(value, Normalizer.Form.NFKD)
.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
But how can I exclude German umlauts?
As a result I would like to get:
source: üöäâÇæôøñÁ
desired result: üöäaCaeoonA
or similar
I see two solutions: the first is quite dirty, and the second is, I guess, quite tedious to implement.
1. Remove the characters with umlauts from the string you want to normalize, then put them back after normalization.
2. Don't use the pre-built pattern \\p{InCombiningDiacriticalMarks}. Instead, build your own that excludes the umlaut.
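Both options can be sketched roughly as follows. This is only a minimal illustration, not a tested library solution: the class and method names (UmlautSafeNormalizer, protectUmlauts, keepDiaeresis) are made up for this example, and the first method assumes Java 9+ for Matcher.replaceAll with a lambda. Instead of literally cutting the umlauts out and pasting them back, the first method normalizes only the text between umlauts, which should have the same effect. The second method relies on the assumption that the "umlaut" to exclude is the combining diaeresis U+0308.

```java
import java.text.Normalizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
final class UmlautSafeNormalizer {

    // Every mark in the Combining Diacritical Marks block (U+0300..U+036F).
    private static final Pattern ALL_MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    // The same block minus U+0308 (combining diaeresis), via class intersection.
    private static final Pattern MARKS_EXCEPT_DIAERESIS =
            Pattern.compile("[\\p{InCombiningDiacriticalMarks}&&[^\\u0308]]+");

    // Runs of characters that are NOT one of the precomposed German umlauts
    // (a-, o-, u-umlaut in lower and upper case).
    private static final Pattern NON_UMLAUT_RUN =
            Pattern.compile("[^\u00E4\u00F6\u00FC\u00C4\u00D6\u00DC]+");

    // Option 1: leave the umlauts alone and normalize only what lies between them.
    static String protectUmlauts(String value) {
        // Compose first, so an umlaut written as letter + U+0308 is also protected.
        String composed = Normalizer.normalize(value, Normalizer.Form.NFC);
        return NON_UMLAUT_RUN.matcher(composed).replaceAll(run -> {
            String decomposed = Normalizer.normalize(run.group(), Normalizer.Form.NFKD);
            String stripped = ALL_MARKS.matcher(decomposed).replaceAll("");
            // Quote the result so '$' and '\' in the input are taken literally.
            return Matcher.quoteReplacement(stripped);
        });
    }

    // Option 2: strip every combining mark except the diaeresis, then recompose.
    static String keepDiaeresis(String value) {
        String decomposed = Normalizer.normalize(value, Normalizer.Form.NFKD);
        String stripped = MARKS_EXCEPT_DIAERESIS.matcher(decomposed).replaceAll("");
        return Normalizer.normalize(stripped, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // The sample string from the question, written with Unicode escapes.
        String source = "\u00FC\u00F6\u00E4\u00E2\u00C7\u00E6\u00F4\u00F8\u00F1\u00C1";
        System.out.println(protectUmlauts(source));
        System.out.println(keepDiaeresis(source));
    }
}
```

Two caveats. The second method keeps the diaeresis on every letter, so ë or ï would survive as well, while the first one only protects the six German umlauts. Also, æ and ø have no Unicode decomposition, so neither method turns them into "ae" or "o"; that would need an explicit replacement on top.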
Take a look at :