How can I allocate a UTF-8 string on the stack or on the heap? The example below uses a static array for the stack copy, but the debugger shows the array full of "?". Do I also need to take the code page into account when allocating?
program Project1;

procedure Main;
var
  Stack: Array[0..20] of AnsiChar;
  Heap: PAnsiChar;
begin
  Stack := '漢語漢語漢語漢語';
  GetMem(Heap, 8 * SizeOf(AnsiChar));
  Move(PAnsiChar('漢語漢語漢語漢語')^, Heap^, 8 * SizeOf(AnsiChar));
end;

begin
  Main;
end.
On the other hand, this works fine:
program Project1;

procedure Main;
var
  S: UTF8String;
begin
  S := '漢語漢語漢語漢語';
end;

begin
  Main;
end.
You cannot persuade the compiler to produce a UTF-8 encoded constant. It will provide either ANSI or UTF-16, but not UTF-8. You'll have to handle the encoding yourself.
That could look like this:
procedure Main;
const
  // The eight characters as raw UTF-8 bytes (3 bytes each), plus a null terminator.
  utf8string: PAnsiChar =
    #$E6#$BC#$A2#$E8#$AA#$9E#$E6#$BC#$A2#$E8#$AA#$9E +
    #$E6#$BC#$A2#$E8#$AA#$9E#$E6#$BC#$A2#$E8#$AA#$9E +
    #$00;
var
  Stack: array [0..24] of AnsiChar; // 24 bytes of text + 1 terminator
begin
  Move(Pointer(utf8string)^, Stack, SizeOf(Stack));
end;
Actually, it turns out I was wrong. You can persuade the compiler to UTF-8 encode constants. Like this:
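procedure Main;
const
  utf8str: UTF8String = '漢語漢語漢語漢語';
var
  Stack: array [0..24] of AnsiChar;
begin
  // The array must hold every UTF-8 byte plus the null terminator.
  Assert(Length(utf8str) + 1 = Length(Stack));
  // Copy the bytes, terminator included, into the stack array.
  Move(Pointer(utf8str)^, Stack, SizeOf(Stack));
end;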
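Note that your array was too short for the text once it has been UTF-8 encoded. Each of the eight characters takes three bytes in UTF-8, so the text needs 24 bytes plus one for the null terminator, and an array declared as [0..20] holds only 21.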
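If you would rather check the size than count bytes by hand, a small console program along these lines should do it. This is only a sketch: the program name and the constant name Utf8Text are made up for illustration, and it assumes the source file is saved in an encoding the compiler reads correctly (for example UTF-8 with a BOM). Under that assumption the first line printed should be 24, because Length on a UTF8String counts bytes rather than characters.

```pascal
program Utf8LengthCheck;
{$APPTYPE CONSOLE}

const
  // Hypothetical name; same text as in the question.
  Utf8Text: UTF8String = '漢語漢語漢語漢語';

begin
  // Length of a UTF8String counts AnsiChar elements, i.e. bytes.
  Writeln(Length(Utf8Text));
  // Buffer size needed for a null-terminated copy.
  Writeln(Length(Utf8Text) + 1);
end.
```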
You already know how to allocate memory on the heap, so I don't need to explain that.
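That said, the heap copy in the question has the same sizing problem as the array: 8 * SizeOf(AnsiChar) is only 8 bytes, while the encoded text needs 25. A minimal sketch of a correctly sized heap copy might look like the following. The names Utf8Text and ByteCount are mine, and the same assumption about the source file encoding applies.

```pascal
program Utf8HeapCopy;
{$APPTYPE CONSOLE}

procedure Main;
const
  // Hypothetical name; the compiler stores this constant UTF-8 encoded.
  Utf8Text: UTF8String = '漢語漢語漢語漢語';
var
  Heap: PAnsiChar;
  ByteCount: Integer;
begin
  // Size the buffer in bytes: one per AnsiChar element, plus the terminator.
  ByteCount := Length(Utf8Text) + 1;
  GetMem(Heap, ByteCount);
  try
    // PAnsiChar(Utf8Text) points at the null-terminated UTF-8 bytes.
    Move(PAnsiChar(Utf8Text)^, Heap^, ByteCount);
    Writeln(ByteCount);
  finally
    // Release the block even if something above raises.
    FreeMem(Heap);
  end;
end;

begin
  Main;
end.
```

Deriving the size from Length rather than hard-coding it means the allocation stays correct if the text changes.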