Tags: java, regex, unicode, normalization, unicode-normalization

How to replace characters like "á" with the corresponding English letter


I have a sample String containing characters such as á, é, í, ó, ú, ü, ñ, and I want to replace these special characters with their plain counterparts, for example:
á with a
é with e
and so on.

I have a map with each special character as the key and its replacement as the value.
Now suppose I pass a String such as "novás músíc" into a method. A regex should check it, and if any of the special characters mentioned above is found, it should be replaced with the mapped character.

Please help me with the regex validation part.


Solution

  • You can do this via Unicode normalization, followed by a regular expression to remove the diacritical (combining) marks.

    See this question and its accepted answer: "Convert Unicode to ASCII without changing the string length (in Java)"
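
As a rough illustration of the normalization approach described above, here is a minimal sketch using the standard `java.text.Normalizer` class. The class name `AccentStripper` and the method name `stripAccents` are made up for this example and do not come from the linked answer. The idea is to decompose each accented letter into its base letter plus combining marks (form NFD), then delete the marks with the `\p{M}` character class.

```java
import java.text.Normalizer;

class AccentStripper {

    // Hypothetical helper: decompose accented letters (NFD) so that "á"
    // becomes "a" followed by a combining acute accent, then remove every
    // combining mark (Unicode category M) with a regex.
    public static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripAccents("novás músíc"));         // novas music
        System.out.println(stripAccents("á, é, í, ó, ú, ü, ñ")); // a, e, i, o, u, u, n
    }
}
```

With this approach the map from the question is not needed for the characters listed there. Letters that have no canonical decomposition, such as "ø" or "ß", pass through unchanged, so a small map could still be kept as a fallback for those.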