c++, clang-format

How can I configure clang-format to measure UTF-8 characters by their displayed width rather than their byte length?


When my code contains UTF-8 characters other than plain ASCII, clang-format seems to measure each character by the number of bytes in its encoding rather than by the width it occupies in a terminal.

For example, a CJK character such as 测 (used in the code below) is stored in 3 bytes in UTF-8, but it occupies 2 ASCII columns in vim and other editors.

The expected formatted code is as follows:

#define test   \
  /* 测试 */   \
  "aa"         \
  "bb"         \
  "bb"

But what I actually get is:

#define test   \
  /* 测试 */ \
  "aa"         \
  "bb"         \
  "bb"

Is there a configuration option that produces the expected result?


Solution

  • clang-format already detects UTF-8 on its own; it doesn't need to be told to do so. What you're seeing is therefore a bug, which you should report to the LLVM bug tracker:

    https://bugs.llvm.org/

    Before reporting, make sure you have tested with the latest clang-format version (11.0.0 at the time of writing). You don't want to report a bug from an old version that has already been fixed.