Search code examples
character-encodinggb2312gbk

Why ISO 2022 defined 94- and 96-?


I have some doubts about the standard ISO 2022:

  • What's the difference between using 94-(0x21 - 0xFE) and using 96-(0x20 - 0xFF)?
  • Why, for example in EUC-CN, code in CS1 only use limited area (94- 96-)? Why doesn't it occupy the all? For compatibility or other reasons?

Looking forward to your replies and thx~


Solution

  • You should have a look at Ken Lunde's book "CJKV Information Processing" (2nd edition).

    Q1: What's the difference between using 94-(0x21 - 0xFE) and using 96-(0x20 - 0xFF)?

    A1: If the answer isn't in the book, you'll have to find the people on the ISO 2022 committee and ask them about their design questions :-)

    Q2: Why, for example in EUC-CN, code in CS1 only use limited area (94- 96-)? Why doesn't it occupy the all? For compatibility or other reasons?

    A2: See A1.

    Now some reverse questions:

    RQ1: Are ISO-2022-CN and EUC-CN still in use anywhere? I notice that the Python language supplies codecs for ISO-2022-JP (7 different varieties) and ISO-2022-KR but not for ISO-2022-CN (or -CN-EXT). Lunde says that ISO-2022-CN wasn't widely used.

    RQ2: Why have you tagged your question with gb2312 and gbk?

    RQ3: What is your interest in ISO-2022-CN etc?