I want to print the following ASCII characters using the C programming language on Windows OS:
#include <stdio.h>
int main() {
    for (unsigned int ch = 128; ch < 256; ch++)
    {
        printf("%d = %c\t\t", ch, ch);
    }
    return 0;
}
In the output, I see that the characters are not showing up:
Output
This is probably an encoding-related issue.
How can I decode these characters so that they show up properly?
As some commenters have mentioned, this is not ASCII.
The ASCII table stops at 127. It does not go further. However, some other tables, like the Unicode table, keep entries 0 to 127 identical to ASCII and add more characters after that.
So to give you a clear answer on "How can I print ASCII characters from 128 to 255?", you can't, as there are no ASCII characters with a value above 127.
Now, let's dive into that table you have sent.
During the DOS era, before Unicode launched in 1991, IBM and Microsoft came up with what are known as Code Pages.
Contrary to what seems to be a popular belief, "Extended ASCII" is not a single extension of the ASCII table, but a FAMILY of extensions of the ASCII table.
Code Pages are ASCII extensions, and so is Unicode. They are character tables that start with ASCII and then add their own entries.
Back to the history lesson: different iterations of Microsoft Code Pages exist. If you open the Command Prompt (cmd) on Windows and enter the command chcp, it will print the code page currently in use. The default depends on the system locale: on Western European installations of Windows 10 and 11 it is typically Code Page 850, whereas US English installations default to Code Page 437.
The ASCII extension table you have sent is Code Page 437, as pointed out by commenter @ikegami.
Unlike Unicode, Code Pages aren't universal: they are a DOS and Windows convention. Modern systems use UTF-8, a variable-length encoding of Unicode that needs one byte for each ASCII character and two to four bytes for everything else.
I'll avoid giving you false hopes right away: there is no universal way to make the encoding you are asking for work. You can get it to work on Windows only, and only if you use Windows' Command Prompt (or PowerShell).
In the screenshot you sent, you are clearly using a Linux-like terminal or shell. You will not get the ASCII extension you asked for by printing raw bytes in that kind of environment.
Entering the chcp 437 command changes your Command Prompt's encoding to Code Page 437.
Do note that this is temporary, though: if you open a new CMD window, it will be back to the default encoding, which is Code Page 850 as mentioned above.
Then, if we execute the sample C code you have embedded, here we are:
But it certainly will be annoying to tell the people you might distribute your code to: "Hey, by the way, open the Windows CMD and type chcp 437, and only then run my tool using that exact CMD instance, as using another one won't work."
Therefore, you can use the Windows API to automatically change that:
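#include <stdio.h>
#include <windows.h>
int main() {
    SetConsoleOutputCP(437); // switch the console output to Code Page 437
    for (unsigned int ch = 128; ch < 256; ch++)
    {
        // %u matches the unsigned counter; %c prints the raw byte
        printf("%u = %c\t\t", ch, ch);
    }
    return 0;
}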
Optionally, you can use #ifdef directives to ensure that the Windows API-related lines are only compiled on Windows.
Once compiled, as you can see, even if I set the CMD's encoding to Code Page 850, the characters from Code Page 437 will show up:
This solution, however, is just... bad. If you plan to make your tool Windows-only, then I guess it is fine, but it is generally bad practice to use non-universal character sets and encodings.
Besides, as mentioned earlier, this trickery will only work if you use Windows' CMD (or Windows PowerShell, which works too).
If you use a terminal like the one in the screenshot you sent, it will not work, because practically all modern terminals and shells expect UTF-8.
UTF-8 is a way to encode Unicode characters, and best of all, it is universal.
Practically all modern terminals use UTF-8, and Windows' CMD does too if you ask it to.
Using the Windows API, you can change the CMD or PowerShell encoding to UTF-8:
#include <windows.h>
int main() {
    SetConsoleOutputCP(65001);
    // ...
    return 0;
}
Optionally, you can use #ifdef directives to ensure that the Windows API-related lines are only compiled on Windows.
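In addition, when you write string literals in your C code, they end up encoded in UTF-8 as long as your source file is saved as UTF-8 and your compiler's execution character set is UTF-8. That is the default for GCC and Clang; MSVC needs the /utf-8 option. This makes the following possible: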
#include <stdio.h>
#include <windows.h>
int main() {
    SetConsoleOutputCP(65001);
    printf("Here is an accented character: é\n");
    return 0;
}
See? I have written a character that is not part of the ASCII table directly in my code, and it printed successfully.
You will of course not have the byte values you asked for in your question, but adapting your character codes is a low price to pay to make your code entirely universal.
Besides, it is your only solution if you want to use a terminal/shell like the one you are using.
If you really, absolutely need the character table that you have put in your question:
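#include <stdio.h>
#include <windows.h>
int main() {
    SetConsoleOutputCP(437); // switch the console output to Code Page 437
    for (unsigned int ch = 128; ch < 256; ch++)
    {
        // %u matches the unsigned counter; %c prints the raw byte
        printf("%u = %c\t\t", ch, ch);
    }
    return 0;
}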
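If any character outside the regular ASCII table will do, then move to UTF-8, which is universal and much more practical than what you want to use. Note that the numbers are then Unicode code points (so 233 is é, not the Θ of Code Page 437), and that each of them must be written as a two-byte UTF-8 sequence rather than as a single byte:
#include <stdio.h>
#include <windows.h>
int main() {
    SetConsoleOutputCP(65001); // switch the console output to UTF-8
    // U+0080 to U+009F are control characters, so start at U+00A0 (160).
    for (unsigned int ch = 160; ch < 256; ch++)
    {
        // A lone byte above 127 is not valid UTF-8: code points in this
        // range are encoded as two bytes.
        char utf8[3] = { (char)(0xC0 | (ch >> 6)), (char)(0x80 | (ch & 0x3F)), '\0' };
        printf("%u = %s\t\t", ch, utf8);
    }
    return 0;
}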
You can of course use #ifdef directives to do the Windows-only stuff only when compiling for Windows.