In the XML specification and in various implementations of the Mozilla Universal Character Set Detector (UCSD), there are BOM sequences in which either the byte order or the word order is reversed, but not both. They are called 'unusual octet order':
F.1 Detection Without External Encoding Information
...
00 00 FF FE UCS-4, unusual octet order (2143)
FE FF 00 00 UCS-4, unusual octet order (3412)
Universal Character Set Detector (UCSD) source (just an example):
if (('\xFF' == aBuf[1]) && ('\x00' == aBuf[2]) && ('\x00' == aBuf[3]))
  // FE FF 00 00 UCS-4, unusual octet order BOM (3412)
  mDetectedCharset = "X-ISO-10646-UCS-4-3412";
else if (('\x00' == aBuf[1]) && ('\xFF' == aBuf[2]) && ('\xFE' == aBuf[3]))
  // 00 00 FF FE UCS-4, unusual octet order BOM (2143)
  mDetectedCharset = "X-ISO-10646-UCS-4-2143";
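To make the four possible UCS-4 signatures easier to compare side by side, here is a minimal, self-contained sketch. It is not the UCSD code: the function name detectUcs4Bom and the labels "UTF-32BE" and "UTF-32LE" are assumptions chosen for illustration; only the two X-ISO-10646-UCS-4-* labels come from the excerpt above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of UCSD): returns the charset label implied
// by a 4-byte UCS-4 BOM at the start of buf, or "" if none is present.
std::string detectUcs4Bom(const unsigned char* buf, std::size_t len) {
    assert(buf != nullptr);
    if (len < 4) return "";

    // Pack the first four octets so each signature is a single constant.
    const std::uint32_t sig = (std::uint32_t(buf[0]) << 24) |
                              (std::uint32_t(buf[1]) << 16) |
                              (std::uint32_t(buf[2]) << 8) |
                              std::uint32_t(buf[3]);

    switch (sig) {
        case 0x0000FEFFu: return "UTF-32BE";                // 00 00 FE FF (1234)
        case 0xFFFE0000u: return "UTF-32LE";                // FF FE 00 00 (4321)
        case 0x0000FFFEu: return "X-ISO-10646-UCS-4-2143";  // 00 00 FF FE
        case 0xFEFF0000u: return "X-ISO-10646-UCS-4-3412";  // FE FF 00 00
        default:          return "";
    }
}
```

Calling detectUcs4Bom on the bytes 00 00 FF FE returns "X-ISO-10646-UCS-4-2143", and on FE FF 00 00 it returns "X-ISO-10646-UCS-4-3412".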
Universal Character Set Detector (UCSD) docs:
Known character sets
...
X-ISO-10646-UCS-4-2143
X-ISO-10646-UCS-4-3412
Is there any hardware in existence that uses this endianness? Is there such an encoding, or an ISO standard for it? Are there any popular libraries that support encoding or decoding it? Why aren't these sequences simply ignored like any other invalid sequence?
ISO 10646 and Unicode define only big-endian and little-endian UCS-4/UTF-32, not middle-endian variants. To my knowledge, no software in existence uses these encodings, so they are practically irrelevant. Why then does the XML standard mention them? I don't know, but I guess it was driven by a desire for theoretical completeness rather than any practical value; the same likely applies to character detection and conversion software that includes support for them.
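Historically, some systems have used a middle-endian byte order: PDP-11s store 32-bit numbers as two little-endian 16-bit words, with the most significant word first. That layout is commonly labeled 3412 (for example PDP_ENDIAN in BSD's endian.h, where 1234 means little-endian). In the XML specification's notation, where 1234 means big-endian, the same byte sequence is 2143, so a BOM stored this way would appear as 00 00 FF FE. So if you were to process UCS-4/UTF-32 on a PDP-11, that format might be useful. But in practice no one does that: PDP-11s were past their heyday by the time Unicode arrived, and since they are only 16-bit machines, UCS-4 is not the best Unicode format to use with them.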
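The PDP-11 layout can be sketched in a few lines. The helper names toPdpOrder and fromPdpOrder are hypothetical, and the sketch assumes the usual PDP-11 convention for 32-bit integers: most significant 16-bit word first, each word stored little-endian. The digit labels depend on which convention is used: endian.h calls this layout 3412 (with 1234 meaning little-endian), while in the XML notation (with 1234 meaning big-endian) the same bytes are 2143.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: lay out a code point the way a PDP-11 stores a 32-bit
// integer (high 16-bit word first, each word little-endian).
std::array<std::uint8_t, 4> toPdpOrder(std::uint32_t cp) {
    assert(cp <= 0x7FFFFFFFu);  // UCS-4 code points fit in 31 bits
    const std::uint16_t hi = static_cast<std::uint16_t>(cp >> 16);
    const std::uint16_t lo = static_cast<std::uint16_t>(cp & 0xFFFFu);
    return {static_cast<std::uint8_t>(hi & 0xFF),   // low byte of high word
            static_cast<std::uint8_t>(hi >> 8),     // high byte of high word
            static_cast<std::uint8_t>(lo & 0xFF),   // low byte of low word
            static_cast<std::uint8_t>(lo >> 8)};    // high byte of low word
}

// Inverse of toPdpOrder: reassemble the code point from the four octets.
std::uint32_t fromPdpOrder(const std::array<std::uint8_t, 4>& b) {
    const std::uint32_t hi = (std::uint32_t(b[1]) << 8) | b[0];
    const std::uint32_t lo = (std::uint32_t(b[3]) << 8) | b[2];
    return (hi << 16) | lo;
}
```

With these definitions, toPdpOrder(0xFEFF) yields the octets 00 00 FF FE, which is the signature the XML specification lists as "unusual octet order (2143)".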