delphi, delphi-2010

Can't work with UTF-8 encoding


I load a text file with the code below (the file is UTF-8 encoded). The code comes from the question "How to read a text file that contains 'NULL CHARACTER' in Delphi?":

uses
  IOUtils;

var
  s: string;
  ss: TStringStream;
begin
  s := TFile.ReadAllText('c:\MyFile.txt');
  s := StringReplace(s, #0, '', [rfReplaceAll]);  //Removes NULL CHARS
  ss := TStringStream.Create(s);

  try
    RichEdit1.Lines.LoadFromStream(ss, TEncoding.UTF8); //UTF8
  finally
    ss.Free;
  end;

end;

My problem is that RichEdit1 doesn't load the whole text. It's not because of the null characters; it's because of the encoding. When I run the application with this code instead, it loads the whole text:

uses
  IOUtils;

var
  s: string;
  ss: TStringStream;
begin
  s := TFile.ReadAllText('c:\MyFile.txt');
  s := StringReplace(s, #0, '', [rfReplaceAll]);  //Removes NULL CHARS
  ss := TStringStream.Create(s);

  try
    RichEdit1.Lines.LoadFromStream(ss, TEncoding.Default);
  finally
    ss.Free;
  end;

end;

The only change is TEncoding.UTF8 to TEncoding.Default. The whole text now loads, but it is garbled and unreadable.

My guess is that the file contains some characters that UTF-8 doesn't support, so loading stops when it reaches one of them.

Please help. Are there any workarounds?

EDIT:

I'm sure the file is UTF-8 and that it is plain text. It's an HTML source file. I'm sure it contains null characters, because I saw them in Notepad++. RichEdit1.PlainText is set to True.


Solution

  • You should pass the encoding to TFile.ReadAllText. After that you are working with Unicode strings only and don't have to bother with UTF-8 in the RichEdit.

    var
      s: string;
    begin
      s := TFile.ReadAllText('c:\MyFile.txt', TEncoding.UTF8);
      // normally this shouldn't be necessary 
      s := StringReplace(s, #0, '', [rfReplaceAll]);  //Removes NULL CHARS
      RichEdit1.Lines.Text := s;
    
    end;
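
For context on why the original code fails: as far as I know, TStringStream.Create(s) with no encoding argument encodes the string with TEncoding.Default (the ANSI code page), so the stream's bytes are not UTF-8 and LoadFromStream(ss, TEncoding.UTF8) hits invalid sequences. If you do want to keep the stream-based approach, a minimal sketch is to use the same encoding on both sides. The helper name LoadUtf8IntoLines is hypothetical (not from the question), and this assumes the file really is UTF-8:

```delphi
uses
  SysUtils, Classes, IOUtils;

// Hypothetical helper: reads a UTF-8 file, strips null characters and
// loads the result into any TStrings (for example RichEdit1.Lines).
procedure LoadUtf8IntoLines(const FileName: string; Lines: TStrings);
var
  s: string;
  ss: TStringStream;
begin
  // Decode the file bytes as UTF-8; from here on s is a Unicode string.
  s := TFile.ReadAllText(FileName, TEncoding.UTF8);
  s := StringReplace(s, #0, '', [rfReplaceAll]);  // removes null chars

  // Encode the stream as UTF-8 as well, so its bytes match the encoding
  // passed to LoadFromStream. False: the stream does not own the encoding.
  ss := TStringStream.Create(s, TEncoding.UTF8, False);
  try
    Lines.LoadFromStream(ss, TEncoding.UTF8);
  finally
    ss.Free;
  end;
end;
```

It would be called as LoadUtf8IntoLines('c:\MyFile.txt', RichEdit1.Lines). The round trip through the stream is redundant here, which is why assigning Lines.Text directly, as in the answer above, is the simpler option.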