With Free Pascal 3.0.4, this test program correctly writes ÄÖÜ:
program FPCTest;
uses IdURI;
begin
WriteLn(TIdURI.URLDecode('%C3%84%C3%96%C3%9C'));
ReadLn;
end.
However, if the LazUTF8 unit (as described here) is used, it writes ??? instead:
program FPCTest;
uses IdURI, LazUTF8;
begin
WriteLn(TIdURI.URLDecode('%C3%84%C3%96%C3%9C'));
ReadLn;
end.
How can I fix this decoding error for programs which use LazUTF8?
When the String type is an alias for AnsiString [1], much of Indy's functionality exposes extra parameters/properties to let users control which ANSI encodings are used when AnsiString values are passed around in operations that perform AnsiString<->byte conversions.
1: Delphi pre-2009, and FreePascal/Lazarus when {$ModeSwitch UnicodeStrings} and {$Mode DelphiUnicode} are not used (FYI, Indy 11 will use them!).
In most cases, Indy's default byte encoding is ASCII, because many of the Internet protocols that Indy implements originally supported only ASCII. Individual Indy components upgrade themselves to UTF as appropriate per protocol, and some things use the OS default codepage/charset instead.
Indy's default byte encoding can be changed at runtime by setting the global GIdDefaultTextEncoding variable in the IdGlobal unit, e.g.:
GIdDefaultTextEncoding := encUTF8;
But, in this particular situation, TIdURI.URLDecode() does not use GIdDefaultTextEncoding. It does have an optional ADestEncoding parameter that you can use to specify the byte encoding of the returned AnsiString (in addition to an optional AByteEncoding parameter that specifies the byte encoding of the parsed URL octets, UTF-8 by default), e.g.:
TIdURI.URLDecode('%C3%84%C3%96%C3%9C'
{$IFNDEF FPC_UNICODESTRINGS}, IndyTextEncoding_UTF8, IndyTextEncoding_UTF8{$ENDIF}
)
The above will parse the URL-encoded octets as UTF-8, and then return that data as-is in a UTF-8 encoded AnsiString.
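For context, here is a minimal sketch of how the test program from the question might look with that call applied. It assumes Indy and the LazUtils package are on the unit search path and that the program is compiled without {$ModeSwitch UnicodeStrings}. The Decoded variable is only an illustrative name.

```pascal
program FPCTest;

uses
  IdGlobal,  // declares IndyTextEncoding_UTF8
  IdURI,     // declares TIdURI
  LazUTF8;   // part of LazUtils, which sets DefaultSystemCodePage to CP_UTF8

var
  Decoded: string;  // illustrative variable name, not part of Indy

begin
  // Parse the percent-encoded octets as UTF-8 and, when String is an
  // AnsiString, ask Indy to return the result UTF-8 encoded as well.
  Decoded := TIdURI.URLDecode('%C3%84%C3%96%C3%9C'
    {$IFNDEF FPC_UNICODESTRINGS}, IndyTextEncoding_UTF8, IndyTextEncoding_UTF8{$ENDIF}
  );
  WriteLn(Decoded);
  ReadLn;
end.
```

IdGlobal is added to the uses clause because that is where IndyTextEncoding_UTF8 is declared. The conditional keeps the extra parameters out of the call when String is a UnicodeString, since ADestEncoding only exists when String is an AnsiString.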
If you do not specify an output encoding for ADestEncoding, URLDecode() defaults to the OS default. If you want it to use GIdDefaultTextEncoding instead, specify IndyTextEncoding_Default in the ADestEncoding parameter:
TIdURI.URLDecode('%C3%84%C3%96%C3%9C'
{$IFNDEF FPC_UNICODESTRINGS}, IndyTextEncoding_UTF8, IndyTextEncoding_Default{$ENDIF}
)
Another option would be to use the IndyTextEncoding(CodePage) function for ADestEncoding, passing it FreePascal's DefaultSystemCodePage variable, which the LazUtils package sets to CP_UTF8 [2]:
TIdURI.URLDecode('%C3%84%C3%96%C3%9C'
{$IFNDEF FPC_UNICODESTRINGS}, IndyTextEncoding_UTF8, IndyTextEncoding(DefaultSystemCodePage){$ENDIF}
)
2: I have opened a ticket in Indy's issue tracker to add support for DefaultSystemCodePage when compiling for FreePascal/Lazarus.