I have a text fragment that was decoded incorrectly: it was decoded as cp866, but it is actually UTF-8 ("нажал кабан на баклажан" --> "╨╜╨░╨╢╨░╨╗ ╨║╨░╨▒╨░╨╜ ╨╜╨░ ╨▒╨░╨║╨╗╨░╨╢╨░╨╜"). I'd like to fix it, and I've already written Python code that solves the task:
broken = "╨╜╨░╨╢╨░╨╗ ╨║╨░╨▒╨░╨╜ ╨╜╨░ ╨▒╨░╨║╨╗╨░╨╢╨░╨╜"
fixed = bytes(broken, 'cp866').decode('utf-8')
print(fixed) # it will print 'нажал кабан на баклажан'
However, I originally tried to solve this in D and could not find a way to do it. So, how can this task be solved in D?
At the moment, D does not have extensive native facilities for converting text between encodings.
Here are some options:
- Use std.windows.charset.fromMBSz and toMBSz, which wrap MultiByteToWideChar and WideCharToMultiByte.
- Run the iconv program (example), or use the libiconv library (D1 binding).
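Whichever of these options is used, the underlying operation is the same one the question's Python code performs: map each character of the broken text back to its cp866 byte, then read the resulting bytes as UTF-8. The sketch below only illustrates that round trip, in Python because that is the language the question already uses. The names `CP866`, `mangle`, and `repair` are made up for this example and are not part of any library.

```python
# Illustrative sketch only: CP866, mangle and repair are hypothetical names
# chosen for this example, not functions from any library.
CP866 = "cp866"


def mangle(text: str) -> str:
    """Reproduce the bug: interpret the UTF-8 bytes of `text` as cp866."""
    return text.encode("utf-8").decode(CP866)


def repair(broken: str) -> str:
    """Undo the bug: turn each character back into its cp866 byte,
    then decode the resulting byte sequence as UTF-8."""
    return broken.encode(CP866).decode("utf-8")


if __name__ == "__main__":
    original = "нажал кабан на баклажан"
    broken = mangle(original)
    print(broken)          # the box-drawing mojibake from the question
    print(repair(broken))  # the original Russian text again
```

In terms of the tools listed above, the repair step is a conversion in the UTF-8 to cp866 direction (code page 866 on Windows), after which the output bytes are treated as UTF-8 text. How that maps onto the exact D calls is an assumption to verify against the respective documentation.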