I found a regex pattern that matches any alphabetic letter: \p{L}
So I wrote a regex to remove every character that is not a letter, a digit, or an underscore: /[^\p{L}\d_]/gimu
Unfortunately, it does not work with a Hindi word like #फ्रांस, which gives फरस
See for yourself here https://regex101.com/r/dnXDK0/1
And please help me :-)
You forgot about diacritics (combining marks). \p{Mn} alone covers only non-spacing marks and would still strip spacing vowel signs such as ा, so add \p{M} to the negated character class:
/[^\p{L}\p{M}\d_]/gu
See the regex demo.
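To make the difference concrete, here is a minimal JavaScript sketch that runs the question's input through the original pattern, the \p{M} fix, and the narrower \p{Mn} variant. The helper name stripUnwanted and the constant names are hypothetical, chosen only for illustration. It assumes an engine with Unicode property escape support (ES2018 or later).

```javascript
// Hypothetical helper: removes every character matched by `pattern`.
function stripUnwanted(text, pattern) {
  return text.replace(pattern, "");
}

// Pattern from the question (without the redundant i and m flags).
const lettersOnly = /[^\p{L}\d_]/gu;
// Pattern from the answer: also keeps all combining marks.
const lettersAndMarks = /[^\p{L}\p{M}\d_]/gu;
// Narrower variant: keeps only non-spacing marks (category Mn).
const lettersAndNonSpacing = /[^\p{L}\p{Mn}\d_]/gu;

const input = "#फ्रांस";

// The virama and vowel signs are marks, not letters, so they are stripped.
console.log(stripUnwanted(input, lettersOnly));
// Only the leading "#" is removed; the word stays intact.
console.log(stripUnwanted(input, lettersAndMarks));
// The spacing vowel sign ा (category Mc) is still removed.
console.log(stripUnwanted(input, lettersAndNonSpacing));
```

Which of \p{M} or \p{Mn} is appropriate depends on the scripts you expect. For Devanagari text, \p{M} looks like the safer choice because dependent vowel signs fall into both mark categories.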
Note that you do not need the i and m flags here. m redefines anchor behavior, but your regex contains no ^ or $ anchors (the ^ inside the brackets only negates the class). i makes cased letters match case-insensitively, but \p{L} already matches all letters, both upper- and lowercase.
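As a quick sanity check of the point about flags, the sketch below compares the pattern with and without i and m on a sample string. The sample text and variable names are assumptions made up for this illustration, not taken from the question.

```javascript
// Hypothetical multi-line sample mixing cased letters, digits, marks and punctuation.
const sample = "Été_2024 #फ्रांस\nabc!";

const minimalFlags = /[^\p{L}\p{M}\d_]/gu;
const extraFlags = /[^\p{L}\p{M}\d_]/gimu;

const cleanedMinimal = sample.replace(minimalFlags, "");
const cleanedExtra = sample.replace(extraFlags, "");

// Both patterns remove the same characters for this sample.
console.log(cleanedMinimal === cleanedExtra);
```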