How can I use Unicode in Turbo C++?


How can I use Unicode symbols in Turbo C++?

I particularly want to print superscript and subscript symbols.

I must use the outdated Turbo C++ as this is what my school provides and I have to use this for my project.


Solution

  • Turbo C++ won't give you any direct access to Unicode. It is so old that it most likely cannot even generate code that calls the system's libraries (DLLs), so even if you recreated the header files by hand, you could not call wprintf, which can output proper Unicode even on the arcane cmd terminal Microsoft ships with Windows to this day.

    However, the default character encoding used in the cmd terminal supports some non-ASCII characters; exactly which ones depends on the language (locale) configuration of your OS. (For example, for Western European languages it is usually CP 850, while a US English Windows typically defaults to CP 437 and Central European locales use CP 852.)

    None of these legacy 8-bit code pages includes all ten digits as superscripts, but some may be available (CP 850 has "¹", "²" and "³", for example).
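As a minimal sketch of the code-page approach, assuming the terminal really is using CP 850: there the three available superscript digits are single bytes (0xFB for ¹, 0xFD for ², 0xFC for ³). The helper names `cp850_superscript` and `format_power` are hypothetical, and the code sticks to plain `<stdio.h>` so that an old compiler has a chance of accepting it.

```cpp
#include <stdio.h>

/* Byte value of a superscript digit in code page 850.
   Returns 0 when CP 850 has no superscript form for that digit. */
unsigned char cp850_superscript(int digit)
{
    switch (digit) {
    case 1:  return 0xFB;  /* superscript one   */
    case 2:  return 0xFD;  /* superscript two   */
    case 3:  return 0xFC;  /* superscript three */
    default: return 0;
    }
}

/* Writes base followed by the exponent into out (hypothetical helper).
   Uses the CP 850 superscript byte when one exists and falls back
   to the plain "^n" notation otherwise.
   out must be large enough for the result. */
void format_power(const char *base, int exponent, char *out)
{
    unsigned char sup = cp850_superscript(exponent);
    if (sup != 0)
        sprintf(out, "%s%c", base, sup);
    else
        sprintf(out, "%s^%d", base, exponent);
}
```

From `main` this could be used as `char line[32]; format_power("x", 2, line); printf("%s\n", line);`. The byte values are specific to CP 850. Other code pages may assign different characters to the same bytes (in CP 437, for instance, only 0xFD is still ²), so check the table for the code page that chcp reports.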

    So you can check which code page the terminal uses and look up its character codes on Wikipedia. The chcp command in the Windows terminal shows and changes the current code page. If your Windows version supports UTF-8, which can encode every Unicode character, type chcp 65001 in the terminal to select it. (I don't know which Windows editions support that, nor which one you are using.)

    Once that works, all you need to do is print the UTF-8 byte sequences for the superscript digits, using the "\xHH" escape for each byte in a string. (I am not sure whether Turbo C++ allows that. Otherwise, `printf("%c%c", 0xHH, 0xHH)` will work.)
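A rough sketch of the UTF-8 variant, assuming the terminal has been switched with chcp 65001 and that the compiler accepts "\xHH" escapes in string literals (neither is guaranteed with Turbo C++). The table `SUPERSCRIPT_UTF8` and the helper `superscript_number` are hypothetical names. The byte sequences are the standard UTF-8 encodings of U+2070, U+00B9, U+00B2, U+00B3 and U+2074 to U+2079.

```cpp
#include <stdio.h>
#include <string.h>

/* UTF-8 byte sequences for the superscript digits 0..9, indexed by digit. */
static const char *const SUPERSCRIPT_UTF8[10] = {
    "\xE2\x81\xB0", "\xC2\xB9",     "\xC2\xB2",     "\xC2\xB3",
    "\xE2\x81\xB4", "\xE2\x81\xB5", "\xE2\x81\xB6", "\xE2\x81\xB7",
    "\xE2\x81\xB8", "\xE2\x81\xB9"
};

/* Writes n as a run of UTF-8 superscript digits into out.
   out must hold 3 bytes per decimal digit plus the terminator. */
void superscript_number(unsigned int n, char *out)
{
    char digits[12];
    int i;

    sprintf(digits, "%u", n);      /* decimal digits of n, e.g. "10" */
    out[0] = '\0';
    for (i = 0; digits[i] != '\0'; ++i)
        strcat(out, SUPERSCRIPT_UTF8[digits[i] - '0']);
}
```

A call such as `char sup[40]; superscript_number(10, sup); printf("x%s\n", sup);` then sends the bytes for x¹⁰ to the terminal. Whether they show up as superscripts depends entirely on the terminal's code page and font.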

    For your convenience, here are the code points and UTF-8 encodings of the characters with SUPERSCRIPT in their Unicode name:

    0x00B2: SUPERSCRIPT TWO - ² - utf-8 seq: b'\xc2\xb2'
    0x00B3: SUPERSCRIPT THREE - ³ - utf-8 seq: b'\xc2\xb3'
    0x00B9: SUPERSCRIPT ONE - ¹ - utf-8 seq: b'\xc2\xb9'
    0x0670: ARABIC LETTER SUPERSCRIPT ALEF - ٰ - utf-8 seq: b'\xd9\xb0'
    0x0711: SYRIAC LETTER SUPERSCRIPT ALAPH - ܑ - utf-8 seq: b'\xdc\x91'
    0x2070: SUPERSCRIPT ZERO - ⁰ - utf-8 seq: b'\xe2\x81\xb0'
    0x2071: SUPERSCRIPT LATIN SMALL LETTER I - ⁱ - utf-8 seq: b'\xe2\x81\xb1'
    0x2074: SUPERSCRIPT FOUR - ⁴ - utf-8 seq: b'\xe2\x81\xb4'
    0x2075: SUPERSCRIPT FIVE - ⁵ - utf-8 seq: b'\xe2\x81\xb5'
    0x2076: SUPERSCRIPT SIX - ⁶ - utf-8 seq: b'\xe2\x81\xb6'
    0x2077: SUPERSCRIPT SEVEN - ⁷ - utf-8 seq: b'\xe2\x81\xb7'
    0x2078: SUPERSCRIPT EIGHT - ⁸ - utf-8 seq: b'\xe2\x81\xb8'
    0x2079: SUPERSCRIPT NINE - ⁹ - utf-8 seq: b'\xe2\x81\xb9'
    0x207A: SUPERSCRIPT PLUS SIGN - ⁺ - utf-8 seq: b'\xe2\x81\xba'
    0x207B: SUPERSCRIPT MINUS - ⁻ - utf-8 seq: b'\xe2\x81\xbb'
    0x207C: SUPERSCRIPT EQUALS SIGN - ⁼ - utf-8 seq: b'\xe2\x81\xbc'
    0x207D: SUPERSCRIPT LEFT PARENTHESIS - ⁽ - utf-8 seq: b'\xe2\x81\xbd'
    0x207E: SUPERSCRIPT RIGHT PARENTHESIS - ⁾ - utf-8 seq: b'\xe2\x81\xbe'
    0x207F: SUPERSCRIPT LATIN SMALL LETTER N - ⁿ - utf-8 seq: b'\xe2\x81\xbf'
    0xFC5B: ARABIC LIGATURE THAL WITH SUPERSCRIPT ALEF ISOLATED FORM - ﱛ - utf-8 seq: b'\xef\xb1\x9b'
    0xFC5C: ARABIC LIGATURE REH WITH SUPERSCRIPT ALEF ISOLATED FORM - ﱜ - utf-8 seq: b'\xef\xb1\x9c'
    0xFC5D: ARABIC LIGATURE ALEF MAKSURA WITH SUPERSCRIPT ALEF ISOLATED FORM - ﱝ - utf-8 seq: b'\xef\xb1\x9d'
    0xFC63: ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM - ﱣ - utf-8 seq: b'\xef\xb1\xa3'
    0xFC90: ARABIC LIGATURE ALEF MAKSURA WITH SUPERSCRIPT ALEF FINAL FORM - ﲐ - utf-8 seq: b'\xef\xb2\x90'
    0xFCD9: ARABIC LIGATURE HEH WITH SUPERSCRIPT ALEF INITIAL FORM - ﳙ - utf-8 seq: b'\xef\xb3\x99'
    
    

    (This was generated with the following Python snippet in interactive mode:)

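    import unicodedata

    # Walk every Unicode code point and list those named "...SUPERSCRIPT...".
    for i in range(0x110000):
        char = chr(i)
        try:
            name = unicodedata.name(char)
        except ValueError:
            # Code point has no name (unassigned, control, surrogate): skip it.
            continue
        if "SUPERSCRIPT" not in name:
            continue
        print(f"0x{i:04X}: {name} - {char} - utf-8 seq: {char.encode('utf-8')}")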
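The byte sequences in the table follow mechanically from the code points, so they can also be computed on the C++ side instead of being copied. The following is only a sketch with a hypothetical helper, `utf8_encode`. It takes the code point as `unsigned long` because `int` is only 16 bits wide in a 16-bit DOS compiler such as Turbo C++.

```cpp
/* Encodes code point cp as UTF-8 into out and appends a 0 terminator.
   out must have room for 5 bytes. Returns the number of encoded bytes,
   or 0 if cp is a surrogate or lies beyond U+10FFFF. */
int utf8_encode(unsigned long cp, unsigned char *out)
{
    int n;

    if (cp < 0x80UL) {                       /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        n = 1;
    } else if (cp < 0x800UL) {               /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000UL) {             /* 3 bytes: 1110xxxx + 2 more */
        if (cp >= 0xD800UL && cp <= 0xDFFFUL)
            return 0;                        /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp < 0x110000UL) {            /* 4 bytes: 11110xxx + 3 more */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        return 0;                            /* outside the Unicode range */
    }
    out[n] = 0;
    return n;
}
```

For example, `utf8_encode(0x2074UL, buf)` fills `buf` with the bytes E2 81 B4, which matches the SUPERSCRIPT FOUR row of the table.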