delphi delphi-2007 tstringlist

TStringlist not loading Google Contacts file


I'm trying to use a TStringList to load a CSV file generated by Google Contacts. When I open this file in a text editor like Sublime Text, I can see the contents properly, with 75 lines. This is a sample from the Google Contacts file:

Name,Given Name,Additional Name,Family Name,Yomi Name,Given Name Yomi,Additional Name Yomi,Family Name Yomi,Name Prefix,Name Suffix,Initials,Nickname,Short Name,Maiden Name,Birthday,Gender,Location,Billing Information,Directory Server,Mileage,Occupation,Hobby,Sensitivity,Priority,Subject,Notes,Group Membership,Phone 1 - Type,Phone 1 - Value,Phone 2 - Type,Phone 2 - Value,Phone 3 - Type,Phone 3 - Value
H,H,,,,,,,,,,,,,   1-01-01,,,,,,,,,,,,* My Contacts ::: Importado 01/02/16,,,,,,
H - ?,H,-,?,,,,,,,,,,,   1-01-01,,,,,,,,,,,,* My Contacts ::: Importado 01/02/16,Mobile,031-863-64393,,,,
H - ?,H,-,?,,,,,,,,,,,,,,,,,,,,,,,* My Contacts ::: Importado 01/02/16,Mobile,031-986-364393,,,,

But when I try to load this same file using a TStringList, this is what I see in its Text property:

'ÿþN'#$D#$A

Here is my code :

procedure TForm1.LoadFile;
var
  sl: TStringList;
begin
  sl := TStringList.Create;
  try
    sl.LoadFromFile('c:\google.csv');
    ShowMessage('lines : ' + IntToStr(sl.Count) + ' / text : ' + sl.Text);
  finally
    sl.Free; // release the list even if loading raises an exception
  end;
end;

This is the result I get:

'1 / 'ÿþN'#$D#$A'

What is happening here?

Thanks


Solution

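  • According to the hex dump you provided, the file begins with the bytes FF FE (the 'ÿþ' in your output), which is the byte order mark (BOM) for UTF-16LE. TStringList in Delphi 2007 treats the content as ANSI text, so it stops at the first zero byte, which comes right after the 'N'. You have a few options, as I see it: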

    1. Switch to Unicode and use the TnT Unicode controls to work with this file.
    2. Read the file as an array of bytes, convert it to ANSI, and then continue using ANSI encoded text. Obviously you'll lose information for any characters that cannot be encoded by your ANSI code page. A cheap way to do this is to copy the content after the first two bytes (the BOM) into a WideString, and then assign that WideString to an ANSI string.
    3. Port your program to a Unicode version of Delphi (anything later than Delphi 2007, that is, Delphi 2009 or newer) and work natively with Unicode.
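
    As a rough illustration of option 2, here is a minimal sketch for Delphi 2007. The helper name LoadUtf16LeAsAnsi and the program name are made up for this example, and the code assumes the file really is UTF-16LE with a BOM, as yours appears to be. It is not meant as a complete or robust solution.

```delphi
program LoadGoogleCsv;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

// Hypothetical helper (not part of the RTL): reads a UTF-16LE file
// and returns its text converted to the system ANSI code page.
function LoadUtf16LeAsAnsi(const FileName: string): AnsiString;
var
  Stream: TFileStream;
  Bom: Word;
  CharCount: Integer;
  Wide: WideString;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    // The UTF-16LE BOM is the byte pair FF FE, which reads as the
    // Word value $FEFF on a little-endian machine.
    if (Stream.Read(Bom, SizeOf(Bom)) <> SizeOf(Bom)) or (Bom <> $FEFF) then
      raise Exception.Create('Not a UTF-16LE file with BOM: ' + FileName);

    // Everything after the BOM is a sequence of 2-byte characters.
    CharCount := Integer((Stream.Size - Stream.Position) div SizeOf(WideChar));
    SetLength(Wide, CharCount);
    if CharCount > 0 then
      Stream.ReadBuffer(Wide[1], CharCount * SizeOf(WideChar));
  finally
    Stream.Free;
  end;

  // Implicit WideString -> AnsiString conversion uses the system code
  // page; characters it cannot represent are lost.
  Result := Wide;
end;

var
  sl: TStringList;
begin
  sl := TStringList.Create;
  try
    // Same file path as in the question.
    sl.Text := LoadUtf16LeAsAnsi('c:\google.csv');
    Writeln('lines : ', sl.Count);
  finally
    sl.Free;
  end;
end.
```

    Assigning the converted string to sl.Text lets the rest of your code keep working with the TStringList as before, at the cost of any characters that your ANSI code page cannot encode.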

    I rather suspect that you are not very familiar with text encodings. If you were, then I think you would have been able to answer the question yourself. That's just fine, but I urge you to take the time to learn about this issue properly. If you rush into coding now, before having a sound grounding, you are sure to make a mess of it. And we've seen so many people make that same mistake. Please don't add to the list of text encoding casualties.