Tags: delphi, utf-8, base64, delphi-7

Create Base64 string from Hebrew text in Delphi


I am trying to encode the 'subject' field of an email, written in Hebrew, into Base64 so that the subject can be read correctly in all browsers. At the moment I am using the Windows-1255 encoding, which works on some clients but not all, so I want to use UTF-8 with Base64 instead.

My reading on the subject (no pun intended) shows that the text has to be in the form

=?<charset>?<encoding>?<encoded text>?=

eg

=?windows-1255?Q?=E0=E1?=

I have taken encoded subject lines from letters which were sent to me in Hebrew with UTF-8 and 'B' (Base64) encoding and decoded them successfully on this website, www.webatic.com/run/convert/base64.php. I have also used this website to encode simple letters and have noticed that the result is not the same as the one I get from a Delphi algorithm.

So - I am looking for an algorithm which successfully encodes letters such as aleph (ord=224), bet (ord=225), etc. According to the website, the string composed of the two letters aleph and bet returns the code 15DXkQ==, but the basic Delphi algorithm returns Ue4 and the TIdEncoderQuotedPrintable component returns =E0=E1 (which is the quoted-printable form of the raw Windows-1255 bytes, not UTF-8).

Edit (after several comments):

I asked a friend to send me an email from her Mac computer, which unsurprisingly uses UTF-8 encoding (as opposed to Windows-1255). The subject was one letter, aleph, ord 224. The encoded subject appeared in the email's header as follows

=?UTF-8?B?15A=?=

This can be separated into three parts: the 'prefix' (=?UTF-8?B?), which means that UTF-8 with Base64 encoding is being used; the 'payload' (15A=), which the website I quoted correctly translates as the letter aleph; and the 'suffix' (?=).
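
The three-part structure described above can be illustrated with a minimal Delphi sketch. SplitEncodedWord is a hypothetical helper written for this example; it is not part of Indy or the RTL, and it only checks the overall shape of the string without validating the charset or decoding the payload.

```delphi
program SplitEncodedWordSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper: splits '=?charset?encoding?payload?=' into its fields.
// Returns False if the text does not have that overall shape.
function SplitEncodedWord(const S: string;
  var CharSet, Encoding, Payload: string): Boolean;
var
  Body: string;
  P: Integer;
begin
  Result := False;
  // The word must start with '=?' and end with '?='.
  if (Length(S) < 6) or (Copy(S, 1, 2) <> '=?') or
     (Copy(S, Length(S) - 1, 2) <> '?=') then
    Exit;
  // Strip the leading '=?' and the trailing '?='.
  Body := Copy(S, 3, Length(S) - 4);
  // First '?' ends the charset.
  P := Pos('?', Body);
  if P = 0 then
    Exit;
  CharSet := Copy(Body, 1, P - 1);
  Delete(Body, 1, P);
  // Second '?' ends the encoding letter ('B' or 'Q').
  P := Pos('?', Body);
  if P = 0 then
    Exit;
  Encoding := Copy(Body, 1, P - 1);
  Payload := Copy(Body, P + 1, MaxInt);
  Result := True;
end;

var
  CharSet, Encoding, Payload: string;
begin
  if SplitEncodedWord('=?UTF-8?B?15A=?=', CharSet, Encoding, Payload) then
  begin
    WriteLn('charset : ', CharSet);   // UTF-8
    WriteLn('encoding: ', Encoding);  // B
    WriteLn('payload : ', Payload);   // 15A=
  end;
end.
```

The same split applies to the quoted-printable example =?windows-1255?Q?=E0=E1?=, where the encoding field is Q and the payload is =E0=E1.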

I need an algorithm to translate an arbitrary string of letters, most of which will be in Hebrew (and thus with ord >= 224) into base64/utf-8; a correct solution is one that decodes correctly on the web site quoted.


Solution

  • You do not need to encode the Subject property manually at all. TIdMessage encodes it automatically for you. Simply assign the Edit1.Text value as-is to the Subject and let TIdMessage encode it as needed.

    If you want to customize how TIdMessage encodes headers, use the TIdMessage.OnInitializeISO event to provide the desired charset and encoding values. In Delphi 2009+, it defaults to UTF-8 and Base64. In earlier versions, TIdMessage reads the RTL's current OS language and chooses default values for the languages it knows about. Hebrew is not one of them, so ISO-8859-1 and QuotedPrintable end up being used. You can override those values, e.g.:

    email.Subject := Edit1.Text;
    

    together with a handler assigned to the message's OnInitializeISO event:

    procedure TForm1.emailInitializeISO(var VHeaderEncoding: Char; var VCharSet: string);
    begin
      VHeaderEncoding := 'B';
      VCharSet := 'UTF-8';
    end;
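
For checking a result by hand against the =?UTF-8?B?15A=?= header quoted in the question, the following is a minimal sketch of what a manual encoder might look like. It assumes Delphi 7, where string is an AnsiString holding Windows-1255 bytes, and it assumes code page 1255 is installed on the machine. Win1255ToUtf8, Base64Encode and EncodeSubject are hypothetical names introduced for this example; they are not Indy functions, and this is not necessarily how TIdMessage does the work internally.

```delphi
program EncodeSubjectSketch;
{$APPTYPE CONSOLE}

uses
  Windows;

const
  // Standard Base64 alphabet; string index 1 holds the character for value 0.
  Base64Alphabet: AnsiString =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: re-encodes Windows-1255 bytes as UTF-8 bytes.
function Win1255ToUtf8(const S: AnsiString): AnsiString;
var
  W: WideString;
  Len: Integer;
begin
  Result := '';
  if S = '' then
    Exit;
  // The first call asks for the required length, the second converts.
  Len := MultiByteToWideChar(1255, 0, PAnsiChar(S), Length(S), nil, 0);
  SetLength(W, Len);
  MultiByteToWideChar(1255, 0, PAnsiChar(S), Length(S), PWideChar(W), Len);
  Result := UTF8Encode(W);
end;

// Hypothetical helper: plain Base64 with '=' padding, no line breaks.
function Base64Encode(const S: AnsiString): AnsiString;
var
  I, N: Integer;
  B0, B1, B2: Byte;
begin
  Result := '';
  N := Length(S);
  I := 1;
  while I <= N do
  begin
    // Take up to three bytes; missing bytes count as zero.
    B0 := Ord(S[I]);
    if I + 1 <= N then B1 := Ord(S[I + 1]) else B1 := 0;
    if I + 2 <= N then B2 := Ord(S[I + 2]) else B2 := 0;
    Result := Result + Base64Alphabet[(B0 shr 2) + 1];
    Result := Result + Base64Alphabet[(((B0 and $03) shl 4) or (B1 shr 4)) + 1];
    if I + 1 <= N then
      Result := Result + Base64Alphabet[(((B1 and $0F) shl 2) or (B2 shr 6)) + 1]
    else
      Result := Result + '=';
    if I + 2 <= N then
      Result := Result + Base64Alphabet[(B2 and $3F) + 1]
    else
      Result := Result + '=';
    Inc(I, 3);
  end;
end;

// Wraps the Base64 payload in the '=?charset?encoding?payload?=' form.
function EncodeSubject(const Win1255Text: AnsiString): AnsiString;
begin
  Result := '=?UTF-8?B?' + Base64Encode(Win1255ToUtf8(Win1255Text)) + '?=';
end;

begin
  // #$E0 is aleph and #$E1 is bet in Windows-1255.
  WriteLn(EncodeSubject(#$E0));       // =?UTF-8?B?15A=?=
  WriteLn(EncodeSubject(#$E0#$E1));   // =?UTF-8?B?15DXkQ==?=
end.
```

Aleph is U+05D0, which is the two bytes D7 90 in UTF-8, and those bytes encode to 15A= in Base64, matching the header sent from the Mac. The sketch does not handle the RFC 2047 rule that a single encoded word may be at most 75 characters long, so a long subject would have to be split into several encoded words without breaking a multi-byte character. That is one more reason to let TIdMessage do the encoding in real code.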