I need a test case that exercises, and ideally breaks, conversions between UTF-32 and UTF-16.
For UTF-8 and UTF-16, I generally use the 'Chinese Bone' test: U+9AA8, which is 0xE9 0xAA 0xA8 in UTF-8 and 0x9AA8 in UTF-16.
Does anyone have a negative test case that should break a poorly written implementation for UTF-16 and UTF-32? Ideally, the test will require use of at least two UTF-32 values.
Jeff
I'm not sure exactly what you mean, but here are some ill-formed inputs:
UTF-16
\xD8\x00\x00\x00
or \xD8\x00\xDB\xFF
\x00\x61\xDC\x00
\xDF\xFF\xDB\xFF
\xD8\x01<EOF>
b'\xD8\x00\xDC'.decode('utf-16be')
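Here is a minimal Python 3 sketch that checks whether a strict decoder rejects each of the UTF-16 sequences above, reading them as big-endian bytes. The names ILL_FORMED_UTF16BE and rejects_utf16be are made up for illustration; swap in your own converter where the built-in codec is called.

```python
# Ill-formed UTF-16BE inputs from the list above.
ILL_FORMED_UTF16BE = [
    b"\xD8\x00\x00\x00",  # high surrogate followed by U+0000
    b"\xD8\x00\xDB\xFF",  # high surrogate followed by another high surrogate
    b"\x00\x61\xDC\x00",  # 'a' followed by a lone low surrogate
    b"\xDF\xFF\xDB\xFF",  # low surrogate first, then a high surrogate
    b"\xD8\x01",          # high surrogate at end of input
    b"\xD8\x00\xDC",      # input truncated in the middle of a code unit
]


def rejects_utf16be(data):
    """Return True if a strict UTF-16BE decode of `data` fails."""
    try:
        data.decode("utf-16-be")
    except UnicodeDecodeError:
        return True
    return False


# One boolean per input; a conforming strict decoder should reject all of them.
results = [rejects_utf16be(sample) for sample in ILL_FORMED_UTF16BE]
```

A converter that silently passes any of these through to UTF-32 as a surrogate value is producing ill-formed output.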
UTF-32
value < 0, value > 0x10FFFF, or 0xD800 <= value && value <= 0xDFFF
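The three UTF-32 conditions can be sketched as a small Python 3 check. This is only an illustration: is_valid_utf32_value and first_invalid_utf32be are hypothetical names, and the code units are unpacked as signed 32-bit integers on the assumption that the implementation under test stores them in a signed type, which is the only way "value < 0" can come up.

```python
import struct


def is_valid_utf32_value(value):
    """Return True if `value` is a Unicode scalar value."""
    if value < 0 or value > 0x10FFFF:
        return False  # outside the Unicode code space
    if 0xD800 <= value <= 0xDFFF:
        return False  # surrogates are not valid in UTF-32
    return True


def first_invalid_utf32be(data):
    """Return the index of the first invalid 32-bit unit in big-endian
    `data`, or None if every unit is a valid scalar value."""
    if len(data) % 4 != 0:
        raise ValueError("UTF-32 data length must be a multiple of 4")
    count = len(data) // 4
    # Assumption: units are read as signed, as an int32_t-based converter would.
    units = struct.unpack(">%di" % count, data)
    for index, value in enumerate(units):
        if not is_valid_utf32_value(value):
            return index
    return None


# Two-unit inputs, since the question asked for at least two UTF-32 values.
too_large = struct.pack(">2I", 0x61, 0x110000)
surrogate = struct.pack(">2I", 0xD800, 0x61)
negative = struct.pack(">2I", 0x61, 0xFFFFFFFF)  # -1 when read as signed
well_formed = struct.pack(">2I", 0x9AA8, 0x10FFFF)
```

Putting the bad value second, as in too_large and negative, also checks that the converter does not stop validating after the first unit.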