Decode UTF-8 from JSON file


I have a JSON file with a UTF-8 encoded string field that holds the contents of a JPG file:

"ImageData": "ÿØÿà\u0000\u0010JFIF\u0000\u0001\u0002\u0000\u0000d\u0000d\u0000\u0000

I am parsing the JSON and getting that value:

var imageString : string;
...
imageString:=jv.GetValue<string>('ImageData');

But I am having trouble decoding the bytes and saving them to a file:

Option 1. SaveBytesToFile(BytesOf(imageString),pathFile);

The header of the saved file is not correct (it should start with ÿØÿà).

Option 2. SaveBytesToFile(TEncoding.UTF8.GetBytes(imageString),pathFile);

This has a similar issue to option 1.

Code for SaveBytesToFile:

procedure SaveBytesToFile(const Data: TBytes; const FileName: string);
var
  stream: TMemoryStream;
begin
  stream := TMemoryStream.Create;
  try
    if length(data) > 0 then
      stream.WriteBuffer(data[0], length(data));
    stream.SaveToFile(FileName);
  finally
    stream.Free;
  end;
end;

How can I decode it properly?


Solution

  • JSON is a text-only format and has no provisions for handling binary data at all. Why are the image bytes not encoded in a text-compatible format, like base64, base85, or base91? Alternatively, use something like BSON (Binary JSON) or UBJSON (Universal Binary JSON) instead, both of which support binary data.
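
    If the producer of the JSON file can be changed, the base64 route could look roughly like the sketch below. It uses TNetEncoding.Base64 from System.NetEncoding; the function names ImageBytesToText and ImageTextToBytes are hypothetical, chosen only for illustration.

    ```delphi
    uses
      System.SysUtils, System.NetEncoding;

    // Producer side (hypothetical helper): turn raw image bytes into
    // plain ASCII text that can be stored safely in a JSON string.
    function ImageBytesToText(const Data: TBytes): string;
    begin
      Result := TNetEncoding.Base64.EncodeBytesToString(Data);
    end;

    // Consumer side (hypothetical helper): turn the JSON string value
    // back into the original bytes, with no charset conversion involved.
    function ImageTextToBytes(const Text: string): TBytes;
    begin
      Result := TNetEncoding.Base64.DecodeStringToBytes(Text);
    end;
    ```

    On the reading side, the result of ImageTextToBytes(jv.GetValue<string>('ImageData')) could then be passed straight to SaveBytesToFile(). Note that the default encoder may insert line breaks into long output, which the JSON writer would have to escape.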

    In any case, BytesOf() will corrupt bytes, since it uses the user's default locale (via TEncoding.Default, which is UTF-8 on non-Windows platforms!), so characters outside of the ASCII range are subject to locale interpretation and won't produce the bytes you need.
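
    To illustrate why a charset conversion such as Option 2 changes the data, here is a small sketch that UTF-8 encodes the first two characters of the JPEG header. The function name HeaderCharsAsUtf8 is hypothetical.

    ```delphi
    uses
      System.SysUtils;

    // Hypothetical demo: UTF-8 encode the characters U+00FF ('ÿ') and
    // U+00D8 ('Ø'), i.e. the first two characters of the parsed string.
    function HeaderCharsAsUtf8: TBytes;
    var
      s: string;
    begin
      s := Char($FF) + Char($D8);
      // UTF-8 stores every code point above U+007F as a multi-byte
      // sequence, so the result is $C3 $BF $C3 $98 (4 bytes) rather
      // than the $FF $D8 (2 bytes) that a JPEG file must start with.
      Result := TEncoding.UTF8.GetBytes(s);
    end;
    ```

    The saved file therefore starts with different bytes than the original image, which matches the broken header seen in the question.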

    In your situation, make sure the JSON library decodes the JSON file as UTF-8. Then you can simply loop through the resulting string (the JSON library should parse the escaped sequences into characters for you) and truncate each character as-is to an 8-bit value. Don't perform any kind of charset conversion at all. This relies on every character in the string having a code point of 255 or less, as in the ÿØÿà header shown in the question. For example:

    var
      imageString : string;
      imageBytes: TBytes;
      i: Integer;
      ...
    begin
      ...
    
      imageString := jv.GetValue<string>('ImageData');
    
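      SetLength(imageBytes, Length(imageString));
      // Keep only the low 8 bits of each character. TBytes is 0-based;
      // the string is indexed from 1 here (the default string indexing is assumed).
      for i := 0 to Length(imageString)-1 do begin
        imageBytes[i] := Byte(imageString[i+1]);
      end;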
    
      SaveBytesToFile(imageBytes, pathFile);
    
      ...
    end;
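
    A slightly more defensive variant of the same loop is sketched below. It raises an exception instead of silently truncating when a character does not fit into one byte, which would indicate that the string was not produced the way this approach assumes. The function name CharsToBytesChecked is hypothetical.

    ```delphi
    uses
      System.SysUtils;

    // Hypothetical helper: map each character's code point to one byte,
    // raising EConvertError if a character is above U+00FF.
    function CharsToBytesChecked(const S: string): TBytes;
    var
      i: Integer;
    begin
      SetLength(Result, Length(S));
      for i := 0 to Length(S) - 1 do
      begin
        // S.Chars[] (TStringHelper) is always 0-based, regardless of
        // the compiler's string indexing setting.
        if Ord(S.Chars[i]) > $FF then
          raise EConvertError.CreateFmt(
            'Character at index %d (U+%.4x) does not fit in one byte',
            [i, Ord(S.Chars[i])]);
        Result[i] := Byte(Ord(S.Chars[i]));
      end;
    end;
    ```

    It would be called as SaveBytesToFile(CharsToBytesChecked(imageString), pathFile);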
    

    On a side note, your SaveBytesToFile() can be greatly simplified without wasting memory making a copy of the TBytes:

    procedure SaveBytesToFile(const Data: TBytes; const FileName: string);
    var
      stream: TBytesStream;
    begin
      stream := TBytesStream.Create(Data);
      try
        stream.SaveToFile(FileName);
      finally
        stream.Free;
      end;
    end;
    

    Or:

    procedure SaveBytesToFile(const Data: TBytes; const FileName: string);
    var
      stream: TFileStream;
    begin
      stream := TFileStream.Create(FileName, fmCreate);
      try
        stream.WriteBuffer(PByte(Data)^, Length(Data));
      finally
        stream.Free;
      end;
    end;
    

    Or:

    uses
      ..., System.IOUtils;
    
    procedure SaveBytesToFile(const Data: TBytes; const FileName: string);
    begin
      System.IOUtils.TFile.WriteAllBytes(FileName, Data);
    end;