[Please note I am using _XOPEN_SOURCE_EXTENDED 1 and setlocale(LC_CTYPE, "").]
Curses includes various functions for extracting characters from the screen; they can be divided into those which grab just the text and those which grab the text plus attributes (bold, color, etc.). The former use wchar_t (or char) and the latter curses' own chtype.
There are constants to mask a chtype to get just the character or just the attributes: A_CHARTEXT and A_ATTRIBUTES. However, from their values, it is easy to see that there will be collisions with wchar_t values over 255: A_ATTRIBUTES is 64 bits wide and only the lower 8 bits are unset.
If the base type internally is chtype, this would mean ncurses was unworkable with most of Unicode, but it isn't -- you can use hardcoded strings in UTF-8 source and write them out with attributes no problem. Where it gets interesting is getting them back again.
wchar_t s[] = L"\412";
This character has a value of 266 and displays as Ċ. However, when extracted into a chtype using, e.g., mvwinchnstr(), it is exactly the same as a space (10) with the COLOR_PAIR(1) attribute (256) set. And in fact, if you take the extracted chtype and redisplay it, you get just that -- a space with COLOR_PAIR(1) set.
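The arithmetic behind that collision can be sketched in plain C. The SKETCH_* macros and sketch_* functions below are hypothetical stand-ins, not curses names: they assume the layout described above, with the character in the low 8 bits and the color pair number starting at bit 8. Real code would take A_CHARTEXT, A_ATTRIBUTES and COLOR_PAIR() from <curses.h> instead.

```c
#include <stdio.h>

/* Hypothetical stand-ins for A_CHARTEXT, A_ATTRIBUTES and COLOR_PAIR(n),
 * assuming the character sits in the low 8 bits and the color pair
 * number starts at bit 8. Real values come from <curses.h>. */
#define SKETCH_CHARTEXT      0xFFUL
#define SKETCH_ATTRIBUTES    (~SKETCH_CHARTEXT)
#define SKETCH_COLOR_PAIR(n) ((unsigned long)(n) << 8)

/* Character part of a packed cell value. */
unsigned long sketch_char_of(unsigned long cell)
{
    return cell & SKETCH_CHARTEXT;
}

/* Attribute part of a packed cell value. */
unsigned long sketch_attrs_of(unsigned long cell)
{
    return cell & SKETCH_ATTRIBUTES;
}

/* Print how a value splits under the two masks. */
void sketch_show(unsigned long cell)
{
    printf("value %lu -> char %lu, attrs %lu\n",
           cell, sketch_char_of(cell), sketch_attrs_of(cell));
}
```

Calling sketch_show(0412) splits the wide character's value into a character part of 10 and an attribute part of 256, which is the same number you get from 10 | SKETCH_COLOR_PAIR(1). Once both are packed into one integer, nothing distinguishes them.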
But if you extract it instead into a wchar_t with, e.g., mvwinnwstr(), it's correct, as is a colored space. The problem with this, of course, is that the attributes are gone. This implies the attributes are being masked out correctly, which is demonstrably impossible with a chtype, since a chtype for both of these has the same value (266). In other words, the internal representation is obviously neither a chtype nor a wchar_t.
I do not use ncurses much, and I notice there are other curses implementations (e.g. Oracle's) with functions that imply the chtype there might not have this problem. In any case, is there a way with ncurses to unambiguously extract wide chars together with their attributes?
[I've tagged this C and C++ since it is applicable in both contexts.]
It is more complicated than that. But briefly:
[...] chtype.
[...] cchar_t.
chtype and cchar_t were not envisioned as possibly different views of the same data. You can only make 8-bit encodings with the former.
[...] addstr (none of the Unix's do).
chtype corresponds to a single cell on the screen, and can hold only an 8-bit character. Interfaces such as winnstr which return a string will work within that constraint. The winchnstr function does return an array of chtype values.
[...] win_wchnstr