Writing XML String with proper formatting?


Please pardon my lack of proper terminology; I'm sure there's a term for this. I'm writing XML text as raw strings (without any XML builder or parser, for ease of use). However, some characters in the data I'm providing break the XML. For example, when a string includes the & symbol, the parser on the receiving end fails. How do I handle this properly and convert strings so they conform to the XML standard?

I'm writing plain strings to a string list and reading its Text property, as shown below. Note the subroutine A(const Text: String; const Indent: Integer = 0), a shorthand for adding a line to the XML output with the requested indentation. The subroutine Standardize is the part I need to fill in.

uses Windows, Classes, SysUtils, DB, ADODB, ActiveX;

function TSomething.FetchXML(const SQL: String): String;
var
  L: TStringList;
  Q: TADOQuery;
  X, Y: Integer;
  function Standardize(const S: String): String;
  begin
    Result:= S; //<<<--- Need to convert string to XML standards
  end;
  procedure A(const Text: String; const Indent: Integer = 0);
  var
    I: Integer;
    S: String;
  begin
    S:= ''; // start with no indentation
    for I := 1 to Indent do // one two-space step per indent level
      S:= S + '  ';
    L.Append(S + Text);
  end;
begin
  Result:= '';
  L:= TStringList.Create;
  try
    Q:= TADOQuery.Create(nil);
    try
      Q.ConnectionString:= FCredentials.ConnectionString;
      Q.SQL.Text:= SQL;
      Q.Open;
      A('<?xml version="1.0" encoding="UTF-8"?>');
      A('<dataset Source="ECatAPI">');
      A('<table>');
      A('<fields>', 1);
      for X := 0 to Q.FieldCount - 1 do begin
        A('<field Name="'+Q.Fields[X].FieldName+'" '+
          'Type="'+IntToStr(Integer(Q.Fields[X].DataType))+'" '+
          'Width="'+IntToStr(Q.Fields[X].DisplayWidth)+'" />', 2);
      end;
      A('</fields>', 1);
      A('<rows>', 1);
      if not Q.IsEmpty then begin
        Q.First;
        while not Q.Eof do begin
          A('<row>', 2);
          for Y:= 0 to Q.FieldCount - 1 do begin
            A('<value Field="'+Q.Fields[Y].FieldName+'">'+
              Standardize(Q.Fields[Y].AsString)+'</value>', 3);
          end;
          A('</row>', 2);
          Q.Next;
        end;
      end;
      A('</rows>', 1);
      A('</table>');
      A('</dataset>');
      Result:= L.Text;
      Q.Close;
    finally
      Q.Free;
    end;
  finally
    L.Free;
  end;
end;

NOTE

The above is pseudo-code: it was copied from real code and modified, with irrelevant parts altered or removed.

MORE INFO

This application is a stand-alone web server providing read-only access to data. I only need to write XML data; I don't need to read it. Even if I did, I already have an XML parser library that covers that part. I'm trying to keep this as lightweight as possible, without filling memory with unnecessary objects.


Solution

  • Thanks to the comments on the question above, I've implemented a function that replaces the five characters that have predefined XML entities (&, ', ", <, >) with the corresponding entity references. This is the new subroutine:

    function EncodeXmlStr(const S: String): String;
    begin
      // '&' must be handled first; otherwise the '&' in the entity
      // references added below would be escaped a second time.
      Result:= StringReplace(S,      '&',  '&amp;',  [rfReplaceAll]);
      Result:= StringReplace(Result, '''', '&apos;', [rfReplaceAll]);
      Result:= StringReplace(Result, '"',  '&quot;', [rfReplaceAll]);
      Result:= StringReplace(Result, '<',  '&lt;',   [rfReplaceAll]);
      Result:= StringReplace(Result, '>',  '&gt;',   [rfReplaceAll]);
    end;
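
For anyone who wants to try the routine in isolation, here is a minimal console sketch. EncodeXmlStr is repeated from above so the program compiles on its own; the ValueElement helper and the sample strings are made up for illustration and are not part of the original code. It also encodes the field name, on the assumption that a column name could contain one of these characters too.

```pascal
program EncodeXmlDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Same routine as in the solution, repeated so this sketch is self-contained.
function EncodeXmlStr(const S: String): String;
begin
  // '&' is replaced first so the entity references added afterwards
  // are not escaped a second time.
  Result:= StringReplace(S,      '&',  '&amp;',  [rfReplaceAll]);
  Result:= StringReplace(Result, '''', '&apos;', [rfReplaceAll]);
  Result:= StringReplace(Result, '"',  '&quot;', [rfReplaceAll]);
  Result:= StringReplace(Result, '<',  '&lt;',   [rfReplaceAll]);
  Result:= StringReplace(Result, '>',  '&gt;',   [rfReplaceAll]);
end;

// Hypothetical helper (not in the original code): builds one <value>
// element, encoding both the attribute value and the element text.
function ValueElement(const FieldName, Value: String): String;
begin
  Result:= '<value Field="' + EncodeXmlStr(FieldName) + '">' +
           EncodeXmlStr(Value) + '</value>';
end;

begin
  // Prints: Smith &amp; Sons
  WriteLn(EncodeXmlStr('Smith & Sons'));
  // Prints: <value Field="Company">a &lt; b</value>
  WriteLn(ValueElement('Company', 'a < b'));
end.
```

In FetchXML, the body of Standardize could then simply be Result:= EncodeXmlStr(S);. Note that this only covers the five predefined entities. Control characters that XML 1.0 does not allow (most characters below #32 other than tab, line feed and carriage return) would still need separate handling if they can appear in the data.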