How can I detect the code page of a stream of text that uses 2 bytes per character? The text is Polish. For ordinary English characters, each pair is just the ANSI code followed by 0x00; for the special Polish characters, the two bytes have a special meaning. There is no file header, just a byte stream like this.
Sample here
string: Połączenia
bytes: 50 00/6f 00/42 01/05 01/63 00/7a 00/65 00/6e 00/69 00/61 00
I think it's not Unicode, because 0x4201 in Unicode is a Chinese character, not a Polish one.
Can anyone help me? Thanks very much!
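It's UTF-16 Little Endian, without a BOM. Each pair stores the low byte first, so 42 01 is U+0142 (ł) and 05 01 is U+0105 (ą), not U+4201. Use hexdump -C to see the bytes in stream order; plain hexdump prints 16-bit words in the host's byte order, which on a little-endian machine makes UTF-16BE output look like your sample.
$ echo -n "Połączenia" | iconv -f UTF-8 -t UTF-16LE | hexdump -C
00000000  50 00 6f 00 42 01 05 01  63 00 7a 00 65 00 6e 00  |P.o.B...c.z.e.n.|
00000010  69 00 61 00                                       |i.a.|
00000014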
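
If you need to check this in code rather than on the command line, here is a minimal Python sketch. The helper name guess_utf16_byte_order is made up for illustration, and the heuristic is an assumption rather than a general detector: it only works for BOM-less UTF-16 text that is mostly Latin script, where the high byte of most characters is 0x00.

```python
# Bytes of the sample string "Połączenia" (6e 00 is the letter n).
raw = bytes.fromhex(
    "50 00 6f 00 42 01 05 01 63 00 7a 00 65 00 6e 00 69 00 61 00"
)


def guess_utf16_byte_order(data: bytes) -> str:
    """Guess the byte order of BOM-less UTF-16 that is mostly Latin text.

    Hypothetical helper: for such text the high byte of most code units
    is 0x00, so the side of each pair that holds more zero bytes is
    taken to be the high-byte side.
    """
    even_zeros = data[0::2].count(0)  # zeros in the first byte of each pair
    odd_zeros = data[1::2].count(0)   # zeros in the second byte of each pair
    # Zeros in the second byte mean the low byte comes first: little endian.
    return "utf-16-le" if odd_zeros >= even_zeros else "utf-16-be"


encoding = guess_utf16_byte_order(raw)
text = raw.decode(encoding)
print(encoding, text)

# Code points of the two special Polish letters (third and fourth characters).
print([f"U+{ord(ch):04X}" for ch in text[2:4]])
```

Decoding the same bytes with the other byte order gives CJK characters, which is the effect described in the question. When a stream does start with a byte order mark (FF FE or FE FF), check that first instead of guessing.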