Tags: python-3.x, unicode, curses, python-curses

python curses addstr y-offset: strange behavior with unicode


I am having trouble with python3 curses and unicode:

#!/usr/bin/env python3
import curses
import locale

locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

def doStuff(stdscr):
  offset = 3
  stdscr.addstr(0, 0, "わたし")
  stdscr.addstr(0, offset, 'hello', curses.A_BOLD)
  stdscr.getch() # pauses until a key's hit

curses.wrapper(doStuff)

I can display Unicode characters just fine, but the offset argument to addstr (the x position, "offset" in my code) is not acting as expected: my screen displays "わたhello" instead of "わたしhello".

In fact the offset has very strange behavior:

- 0:hello
- 1:わhello
- 2:わhello
- 3:わたhello
- 4:わたhello
- 5:わたしhello
- 6:わたしhello
- 7:わたし hello
- 8:わたし  hello
- 9:わたし   hello

Note that the offset is not in bytes, since the characters are 3-byte unicode characters:

>>> len("わ".encode('utf-8'))
3
>>> len("わ")
1

I'm running Python 3 (python3, as in the shebang above) and curses.version is b'2.2'.

Does anyone know what's going on or how to debug this? Thanks in advance.


Solution

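  • You're printing three double-width characters. That is, each of them takes up two cells on the screen.

    The length of the string in characters (or bytes) is not necessarily the same as the number of cells it occupies. The x argument to addstr counts cells, so "わたし" covers columns 0 through 5, and text meant to follow it has to start at column 6, not 3.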

    Python curses is just a thin layer over ncurses.

    I'd expect the characters at offsets 1, 3 and 5 in your list to be erased, because each of those writes a character onto the second cell of a double-width character (ncurses is supposed to handle this). Since they are still displayed, that detail could be a bug in the terminal emulator.
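
To place text directly after a string that contains double-width characters, the column can be computed from the cell width rather than from len(). The following is a minimal sketch, not part of the curses API: display_width is a hypothetical helper that approximates the cell count with the standard unicodedata module. ncurses itself relies on the C library's wcwidth(), which may disagree with this approximation for some code points.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal cells that `text` occupies.

    Hypothetical helper for illustration; it is not provided by curses.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn in the cell of the preceding character.
            continue
        # East Asian Wide ('W') and Fullwidth ('F') characters take two cells.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


prefix = "わたし"
offset = display_width(prefix)  # cell width, not len(prefix)
```

In the question's doStuff, calling stdscr.addstr(0, display_width("わたし"), 'hello', curses.A_BOLD) instead of using a fixed offset of 3 should put "hello" immediately after the Japanese text.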