Tags: c, non-ascii-characters, ascii-art, terminal-emulator, mplayer

libcaca - changing ascii glyphs to Katakana


I am creating a video effect that is supposed to look like the "Matrix" movie, but a bit different: the "Matrix"-like video output will be mixed, through an altered alpha channel, with the real video, so it will look half real, half digits. I simply use mplayer with the caca driver (mplayer -vo caca video.mp4) together with screen recording, and then mix the videos in other software. For this I need to change the "static uint32_t ascii_glyphs[]" array in the file dither.c (from the caca library source as published here: https://github.com/cacalabs/libcaca/blob/master/caca/dither.c) from ' ', '.', ':', ';', 't', '%', 'S', 'X', '@', '8', '?' to contain all Katakana symbols. The problem is that they do not appear to be printable: the terminal output of the video contains only shade blocks. I should say that this bash code:

str123="ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲンヴヵヶヷヸヹヺヽヾヿㇰㇱㇲㇳㇴㇵㇶㇷㇸㇹㇺㇻㇼㇽㇾㇿ㌀㌁㌂㌃㌄㌅㌆㌇㌈㌉㌊㌋㌌㌍㌎㌏㌐㌑㌒㌓㌔㌕㌖㌗㌘㌙㌚㌛㌜㌝㌞㌟㌠㌡㌢㌣㌤㌥㌦㌧㌨㌩㌪㌫㌭㌮㌯㌰㌱㌲㌳㌴㌵㌶㌷㌸㌹㌺㌻㌼㌽㌾㌿㍀㍁㍂㍃㍄㍅㍆㍇㍈㍉㍊㍋㍌㍍㍎㍏㍐㍑㍒㍓㍔㍕㍖㍗ヲァィゥェォャュョッアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン"

for i in $(seq 0 $((${#str123} - 1))); do echo -n "'${str123:i:1}',"; done

works correctly in my terminal (I checked with a couple of terminal programs, and they all print it correctly). The locales are also set:

$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

And the result for the new array:

/* List of glyphs */
static uint32_t ascii_glyphs[] =
{
    /*
    ' ', '.', ':', ';', 't', '%', 'S', 'X', '@', '8', '?'
    */

    /*
    ' ', '!', '"', '#', '$', '%', '&', '\'', '(', ')', '*', '+', ',',
    '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    ':', ';', '<', '=', '>', '?', '@', 'A', 'B', 'C', 'D', 'E', 'F',
    'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',
    'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '[', '\\', ']', '^', '_', '`',
    'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
    'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
    '{', '|', '}', '~'
    */


    ' ', '!', '"', '#', '$', '%', '&', '\'', '(', ')', '*', '+', ',',
    '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    ':', ';', '<', '=', '>', '?','@',
    'ァ','ア','ィ','イ','ゥ','ウ','ェ','エ','ォ','オ','カ','ガ','キ','ギ',
    'ク','グ','ケ','ゲ','コ','ゴ','サ','ザ','シ','ジ','ス','ズ','セ','ゼ',
    'ソ','ゾ','タ','ダ','チ','ヂ','ッ','ツ','ヅ','テ','デ','ト','ド','ナ',
    'ニ','ヌ','ネ','ノ','ハ','バ','パ','ヒ','ビ','ピ','フ','ブ','プ','ヘ',
    'ベ','ペ','ホ','ボ','ポ','マ','ミ','ム','メ','モ','ャ','ヤ','ュ','ユ',
    'ョ','ヨ','ラ','リ','ル','レ','ロ','ヮ','ワ','ヰ','ヱ','ヲ','ン','ヴ',
    'ヵ','ヶ','ヷ','ヸ','ヹ','ヺ','ヽ','ヾ','ヿ','ㇰ','ㇱ','ㇲ','ㇳ','ㇴ',
    'ㇵ','ㇶ','ㇷ','ㇸ','ㇹ','ㇺ','ㇻ','ㇼ','ㇽ','ㇾ','ㇿ','㌀','㌁','㌂',
    '㌃','㌄','㌅','㌆','㌇','㌈','㌉','㌊','㌋','㌌','㌍','㌎','㌏','㌐',
    '㌑','㌒','㌓','㌔','㌕','㌖','㌗','㌘','㌙','㌚','㌛','㌜','㌝','㌞',
    '㌟','㌠','㌡','㌢','㌣','㌤','㌥','㌦','㌧','㌨','㌩','㌪','㌫','㌭',
    '㌮','㌯','㌰','㌱','㌲','㌳','㌴','㌵','㌶','㌷','㌸','㌹','㌺','㌻',
    '㌼','㌽','㌾','㌿','㍀','㍁','㍂','㍃','㍄','㍅','㍆','㍇','㍈','㍉',
    '㍊','㍋','㍌','㍍','㍎','㍏','㍐','㍑','㍒','㍓','㍔','㍕','㍖','㍗',
    '[', '\\', ']', '^', '_', '`',
    'ヲ','ァ','ィ','ゥ','ェ','ォ','ャ','ュ','ョ','ッ','ア','イ','ウ','エ','オ','カ','キ','ク',
    'ケ','コ','サ','シ','ス','セ','ソ','タ','チ','ツ','テ','ト','ナ','ニ','ヌ','ネ','ノ','ハ',
    'ヒ','フ','ヘ','ホ','マ','ミ','ム','メ','モ','ヤ','ユ','ヨ','ラ','リ','ル','レ','ロ','ワ',
    'ン',
    '{', '|', '}', '~'

};

is this: [image: Katakana]

For example, if I change the "static uint32_t ascii_glyphs[]" array to contain the full ASCII set, then the result is: [image: Full ascii]

Update: I tried changing the "static uint32_t ascii_glyphs[]" array to contain the Katakana glyphs in hexadecimal representation, still with no result. But (!) if I add these multibyte characters to the set:

static uint32_t ascii_glyphs[] =
{
    /* CP437 and box drawing */
    0x2591, 0x2592, 0x2593, 0x2588, 0x2584, 0x2580, /* ░ ▒ ▓ █ ▄ ▀ */
    0x2500, 0x2501, 0x2502, 0x2503, 0x253c, 0x254b, /* ─ ━ │ ┃ ┼ ╋ */
    0x252c, 0x2534, 0x2533, 0x253b, 0x2566, 0x2569, /* ┬ ┴ ┳ ┻ ╦ ╩ */
    0x2550, 0x2551, 0x256c, /* ═ ║ ╬ */
    0x2575, 0x2577, 0x2579, 0x257b
};

then those characters are printed correctly. Result: [image: Result]

But if I add Katakana in hexadecimal:

static uint32_t ascii_glyphs[] =
{
    /* CP437 and box drawing */
    0x2591, 0x2592, 0x2593, 0x2588, 0x2584, 0x2580, /* ░ ▒ ▓ █ ▄ ▀ */
    0x2500, 0x2501, 0x2502, 0x2503, 0x253c, 0x254b, /* ─ ━ │ ┃ ┼ ╋ */
    0x252c, 0x2534, 0x2533, 0x253b, 0x2566, 0x2569, /* ┬ ┴ ┳ ┻ ╦ ╩ */
    0x2550, 0x2551, 0x256c, /* ═ ║ ╬ */
    0x2575, 0x2577, 0x2579, 0x257b,

    /* Katakana (part) */
    0x30a1,0x30a2,0x30a3,0x30a4,0x30a5,0x30a6,0x30a7,0x30a8,0x30a9,0x30aa,
    0x30ab,0x30ac,0x30ad,0x30ae,0x30af,0x30b0,0x30b1,0x30b2,0x30b3,0x30b4
}; 

then many blanks (just background and shade characters, without glyphs) are added:

[image: Many blanks]

So why is this still not working? It looks like the terminal (?), gcc (?), or something along the way just does not like Katakana symbols :)

Thank you for your guidance!


Solution

  • The problem is that hiragana and katakana are fullwidth characters. When Caca writes a character to the screen using caca_put_char(), it checks whether a fullwidth character already occupies that position, and if so, it replaces part of it with a space. Since every character position on the screen is written to, each fullwidth character ends up overwritten with a space, so in the end no katakana is visible.

    I think you would have to modify Caca to handle fullwidth characters in the dither character set. If all characters are fullwidth, it should write only to even columns on the screen. If you have a mix, it is more complex, but you could, for example, make it leave a position alone when a fullwidth character is already there.
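
    To make the idea concrete, here is a minimal sketch of the overwrite effect and of the "do not overwrite" fix. It is a toy model, not libcaca's code: put_char(), render_row(), WIDE_TAIL, COLS, the policy enum and the crude is_fullwidth() range check are all names and simplifications I made up for illustration, and the real caca_put_char() may differ in its details.

```c
#include <assert.h>
#include <stdint.h>

#define COLS 8                   /* width of the toy canvas row (assumption) */
#define WIDE_TAIL 0xFFFFFFFFu    /* hypothetical marker: right half of a fullwidth glyph */

enum policy { EVERY_CELL, SKIP_OCCUPIED };

/* Rough width test: only the Katakana block U+30A0..U+30FF counts as
   fullwidth here. A real implementation needs a complete width table. */
static int is_fullwidth(uint32_t ch)
{
    return ch >= 0x30A0 && ch <= 0x30FF;
}

/* Toy version of the overwrite rule: touching one half of a fullwidth
   glyph turns its other half into a space. */
static void put_char(uint32_t row[COLS], int x, uint32_t ch)
{
    assert(x >= 0 && x < COLS);

    /* A fullwidth glyph does not fit in the last column; draw a space instead. */
    if (is_fullwidth(ch) && x + 1 >= COLS)
        ch = ' ';

    if (row[x] == WIDE_TAIL && x > 0)
        row[x - 1] = ' ';        /* we hit a right half: blank the left half */
    else if (is_fullwidth(row[x]))
        row[x + 1] = ' ';        /* we hit a left half: blank the right half */

    row[x] = ch;
    if (is_fullwidth(ch)) {
        if (is_fullwidth(row[x + 1]))
            row[x + 2] = ' ';    /* our right half lands on another glyph's left half */
        row[x + 1] = WIDE_TAIL;
    }
}

/* Fill one row with the glyph picked for each cell and return how many
   fullwidth glyphs are still intact afterwards. */
static int render_row(uint32_t row[COLS], const uint32_t picks[COLS], enum policy p)
{
    int visible = 0;

    for (int x = 0; x < COLS; x++)
        row[x] = ' ';

    for (int x = 0; x < COLS; x++) {
        /* Proposed change: leave the right half of a fullwidth glyph alone. */
        if (p == SKIP_OCCUPIED && row[x] == WIDE_TAIL)
            continue;
        put_char(row, x, picks[x]);
    }

    for (int x = 0; x < COLS; x++)
        visible += is_fullwidth(row[x]);
    return visible;
}
```

    In this model, a row whose eight picks are all katakana ends up with no katakana at all under EVERY_CELL, because each write erases the glyph placed one column earlier. Under SKIP_OCCUPIED the same row keeps four glyphs, at columns 0, 2, 4 and 6, which is the "even columns only" behaviour; with a mix of katakana and ASCII picks, the skipped cells are simply the ones covered by the preceding fullwidth glyph.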