Take this word in Arabic: مرَّة
This could be written with this sequence of Unicode characters:
/*
U+0645 # ARABIC LETTER MEEM
U+0631 # ARABIC LETTER REH
U+0651 # ARABIC SHADDA
U+064e # ARABIC FATHA
U+0629 # ARABIC LETTER TEH MARBUTA
*/
console.log("\u0645\u0631\u0651\u064e\u0629")
or as this sequence (same letters, but the order of FATHA and SHADDA are swapped):
/*
U+0645 # ARABIC LETTER MEEM
U+0631 # ARABIC LETTER REH
U+064e # ARABIC FATHA
U+0651 # ARABIC SHADDA
U+0629 # ARABIC LETTER TEH MARBUTA
*/
console.log("\u0645\u0631\u064e\u0651\u0629")
They are both rendered the same. Are they both considered correct? Is one considered preferable to the other?
I'm not sure if one is considered correct. However, it is interesting to me that normalising the Unicode always results in the shadda being placed second, for the normalisation forms NFC, NFKC, NFD, NFKD.
Take a look at this Python code:
>>> shadda_first = "\u0645\u0631\u0651\u064e\u0629"
>>> shadda_second = "\u0645\u0631\u064e\u0651\u0629"
>>> shadda_second == shadda_first
False
>>> shadda_first
'مرَّة'
>>> shadda_second
'مرَّة'
>>> import unicodedata
>>> unicodedata.normalize("NFC", shadda_second)
'مرَّة'
>>> unicodedata.normalize("NFC", shadda_second) == shadda_second
True
>>> unicodedata.normalize("NFC", shadda_first) == shadda_second
True
>>> unicodedata.normalize("NFKC", shadda_second) == shadda_second
True
>>> unicodedata.normalize("NFKC", shadda_first) == shadda_second
True
>>> unicodedata.normalize("NFD", shadda_second) == shadda_second
True
>>> unicodedata.normalize("NFD", shadda_first) == shadda_second
True
>>> unicodedata.normalize("NFKD", shadda_second) == shadda_second
True
>>> unicodedata.normalize("NFKD", shadda_first) == shadda_second
True
(Please note that Stack Overflow makes it hard to see the KASRA diacritic in the monospace output. Copy and paste the answer in your favourite text editor to see the diacritic in مرَّة.)