Tags: arrays, delphi, unicode, normalization, unicode-normalization

Replace an array of special characters in Delphi


A string that I receive is UTF-8 encoded and contains special characters such as š, đ, č, etc. I am using StringReplace() to convert it to plain text, but each call can replace only one character. PHP has a similar function, str_replace, which also accepts arrays, as shown in "how to replace special characters with the ones they're based on in PHP?":

<?php
  $vOriginalString = "¿Dónde está el niño que vive aquí? En el témpano o en el iglú. ÁFRICA, MÉXICO, ÍNDICE, CANCIÓN y NÚMERO.";

  $vSomeSpecialChars = array("á", "é", "í", "ó", "ú", "Á", "É", "Í", "Ó", "Ú", "ñ", "Ñ");
  $vReplacementChars = array("a", "e", "i", "o", "u", "A", "E", "I", "O", "U", "n", "N");

  $vReplacedString = str_replace($vSomeSpecialChars, $vReplacementChars, $vOriginalString);

  echo $vReplacedString; // outputs '¿Donde esta el nino que vive aqui? En el tempano o en el iglu. AFRICA, MEXICO, INDICE, CANCION y NUMERO.'
?>

How can I do this in Delphi? StringReplace doesn't support arrays.


Solution

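    // Replaces every occurrence of oldChars[i] in str with newChars[i].
    // StringReplace is declared in SysUtils, so that unit must be in the uses clause.
    function str_replace(const oldChars, newChars: array of Char; const str: string): string;
    var
      i: Integer;
    begin
      // The two arrays must pair up one-to-one.
      Assert(Length(oldChars) = Length(newChars));
      Result := str;
      // One StringReplace pass over the whole string per character pair.
      for i := 0 to High(oldChars) do
        Result := StringReplace(Result, oldChars[i], newChars[i], [rfReplaceAll]);
    end;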
    

    If you are concerned about the heap allocations that every StringReplace call makes, you can instead overwrite the characters in place in a single pass over the string:

    function str_replace(const oldChars, newChars: array of Char; const str: string): string;
    var
      i, j: Integer;
    begin
      Assert(Length(oldChars) = Length(newChars));
      // Copy-on-write: the first assignment to Result[i] creates the only copy.
      Result := str;
      for i := 1 to Length(Result) do
        for j := 0 to High(oldChars) do
          if Result[i] = oldChars[j] then
          begin
            Result[i] := newChars[j];
            Break; // at most one replacement per position
          end;
    end;
    

    Call it like this:

    newStr := str_replace(
      ['á','é','í'],
      ['a','e','i'], 
      oldStr
    );
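
To try this end to end, the in-place version can be dropped into a small console program. The following is a minimal sketch, assuming a Unicode-enabled Delphi (2009 or later) in which Char is a 16-bit character and strings are 1-based. The program name StrReplaceDemo and the sample text are made up for illustration. The accented characters are written as character codes (#$0161 for š, #$0111 for đ, #$010D for č) so that the result does not depend on the encoding of the source file.

```delphi
program StrReplaceDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

// Same in-place replacement as above, repeated so the program compiles on its own.
function str_replace(const oldChars, newChars: array of Char; const str: string): string;
var
  i, j: Integer;
begin
  Assert(Length(oldChars) = Length(newChars));
  Result := str;
  for i := 1 to Length(Result) do
    for j := 0 to High(oldChars) do
      if Result[i] = oldChars[j] then
      begin
        Result[i] := newChars[j];
        Break;
      end;
end;

var
  sample: string;
begin
  // "šuma i čađ", spelled with character codes to stay independent of file encoding.
  sample := #$0161'uma i '#$010D'a'#$0111;

  // Map š -> s, đ -> d, č -> c.
  Writeln(str_replace([#$0161, #$0111, #$010D], ['s', 'd', 'c'], sample));
  // prints: suma i cad
end.
```

Note that this version maps exactly one character to one character. A replacement that changes the length of the text, such as ß to ss, would need string-typed arrays, for which the StringReplace-based version is the more natural starting point.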