Search code examples
ms-accessunicodevba

MS Access VBA - how to read utf8 text independantly from Windows system locale


I have a MySQL table with a text field. It contains a hyperlink, and it is encoded in utf8 (utf8-unicode-ci collation). I want to open the hyperlink programmatically from VBA.

The text field may contain characters like "őűö", which are not present in western European codepage (1252), but available in central European (1250).

My first attempt was to run a pass-trough query, read the field value into a VBA string, and open it with Application.Followhyperlink. It works, when windows system locale - default codepage for non-Unicode compatible applications in regional settings - is Hungarian (uses codepage 1250), and fails, when the system locale is German (uses codepage 1252). The VBA string contains a value converted to the codepage specified by the system locale. So "C:\tükörtűrő" will be read as "C:\tukorturo".

I am not allowed to fix the system locale on 100+ computers. So, how to do it right?


Edit:
Lessons learned:

  • Debug.Print doesn't support Unicode – as stated by Erik von Asmuth. The displayed text in the debug window is misleading.
  • Application.FollowHyperlink can handle Unicode.
  • The real problem was a link health check right before opening the link, where I have used the built in GetAttr(), which depends on system locale settings. I have replaced it with GetFileAttributesW(), everything seems to work now. Some credit goes here to Bonnie West. (https://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=74264&lngWId=1)

Solution

  • VBA and Access use UTF-16 internally, not the system codepage, so this shouldn't be a problem at all. Pass-through queries should just work. However:

    • You need to use the MySQL Unicode driver, not the MySQL ANSI driver

    • Not all VBA functions support unicode characters. For example, MsgBox is ANSI only, and will cast unavailable characters to either questionmarks or the closest equivalent ANSI character.

    • The VBA code itself is not unicode. You can see this answer for an approach to set strings to characters that are unavailable in the codepage used by VBA.