Search code examples
vba

How can convert CJK's character in utf8 format with vba?


It is simple to convert CJK's character in utf8 format with python:

print("你".encode("utf8"))
b'\xe4\xbd\xa0'

The character (means you in English) ,whose utf8 encoding is \xe4\xbd\xa0,How can get the result with VBA?


Solution

  • Using ADODB.Stream in VBA to convert CJK characters to UTF-8.

    Option Explicit
    Sub ConvertCJKToUTF8()
        Dim cjkChar As String
        Dim utf8Bytes() As Byte
        Dim stream As Object
        ' The CJK character to convert
        ' cjkChar = "你"
        ' If user can't input CJK char in VBE
        cjkChar = ChrW(20320)
        ' Create ADODB.Stream object
        Set stream = CreateObject("ADODB.Stream")
        ' Initialize the stream
        With stream
            .Type = 2 ' Text mode
            .Charset = "utf-8" ' Set the charset to UTF-8
            .Open
            .WriteText cjkChar
            .Position = 0
            .Type = 1 ' Binary mode
            utf8Bytes = .Read
            .Close
        End With
        ' Print the UTF-8 bytes to the Immediate Window (Ctrl + G)
        Debug.Print "UTF-8 Bytes:"
        Dim i As Long
        For i = LBound(utf8Bytes) + 3 To UBound(utf8Bytes) ' skip header bytes
            Debug.Print "0x" & Hex(utf8Bytes(i))
        Next i
        Set stream = Nothing
    End Sub
    
    

    Output in Immediate Window:

    UTF-8 Bytes:
    0xE4
    0xBD
    0xA0