c# ansi-c

How can I read a text file and loop through repeating sections?


I have an ANSI 835 (text) file. For simplicity's sake it looks like this:

ISA*00
GS*Foo*12345
ST*835*000001
LX*1
CLP*123456
NM1*Lastname
REF*010101
DTM*20120512
SVC*393939
LQ*19
LX*2
CLP*23456
NM1*Smith
REF*58774
DTM*20120601
SVC*985146
LX*3
CLP*34567
NM1*Doe
REF*985432
DTM*20121102
SVC*864253
LQ*19
LQ*84

The records are broken up into LX segments. Everything after LX*1 is one record, everything after LX*2 is another record, and so on. I need to get certain items from each line, assign them to variables, and eventually add them as a row to a datagridview. Again for simplicity's sake, I have the following variables and here's what should go in each:

string ItemNumber should be the group of characters after the * in the CLP line
string LastName should be the group of characters after the * in the NM1 line
string Date should be the group of characters after the * in the REF line
string Error should be the group of characters after the * in the LQ line

The biggest problem I'm facing is that there may be more than one LQ line in each LX segment. In that case, the 2nd error can just be added to the end of the first error, separated by a comma.

I tried loading the file into a string array and going line by line, but I'm not sure how to say "start at LX*1 and do stuff until you hit LX*2".

string[] lines = File.ReadAllLines(MyFile);

foreach (string line in lines)
{
    string[] splitline = line.Split('*');

    if (splitline[0] == "LX")
    {
        //this is where i need to loop through the next lines 
        //until i hit the next line starting with LX.
    }
}

Any ideas? As always, thank you for your time!


Solution

  • Start with a simple data model:

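    using System.Collections.Generic;

    public class LXRecord
    {
        public string ItemNumber { get; set; }
        public string LastName { get; set; }
        public string FirstName { get; set; } // only filled in by ParseName (see the EDIT below)
        public string Date { get; set; }
        public List<string> Errors { get; set; }
    
        public LXRecord()
        {
            Errors = new List<string>(); // start empty so Add() is always safe
        }
    }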
    

    Define your significant tokens:

    public static class Tokens
    {
        public const string TOKEN_SPLITTER = "*";
        public const string NEW_RECORD = "LX";
        public const string ITEM_NUMBER = "CLP";
        public const string LAST_NAME = "NM1";
        public const string DATE = "REF";
        public const string ERROR = "LQ";
    }
    

    Loop through the lines, do a switch/case on the tokens, and just start a new LXRecord when you see the "LX" flag:

    List<LXRecord> records = new List<LXRecord>();
    LXRecord currentRecord = null;
    
    foreach(string line in lines)
    {
        int tokenIndex = line.IndexOf(Tokens.TOKEN_SPLITTER);
        if (tokenIndex < 1 || tokenIndex == line.Length - 1) //no token or no value?
            continue;
    
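        string token = line.Substring(0, tokenIndex);
        string value = line.Substring(tokenIndex + 1);
    
        // Skip segments that appear before the first "LX" (e.g. ISA, GS, ST),
        // so currentRecord is never null inside the switch.
        if (currentRecord == null && token != Tokens.NEW_RECORD)
            continue;
    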
        switch(token)
        {
            case(Tokens.NEW_RECORD) :
                currentRecord = new LXRecord();
                records.Add(currentRecord);
                break;
            case(Tokens.ITEM_NUMBER) :
                currentRecord.ItemNumber = value;
                break;
            case(Tokens.LAST_NAME) :
                currentRecord.LastName = value;
                break;
            case(Tokens.DATE) :
                currentRecord.Date = value;
                break;
            case(Tokens.ERROR) :
                currentRecord.Errors.Add(value);
                break;
        }
    }
    

    Notice that this way you can fairly easily ignore unsupported flags, add new flags, or add parsing (for example, ItemNumber could use Int32.Parse and be stored as an integer, or Date could be stored as a DateTime). In this case I chose to store the errors as a List<string>, but you could build a comma-delimited string instead if you wish. I also avoided splitting on the * character in case the content contains a second asterisk.
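
    The parsing and comma-delimiting ideas mentioned above could look roughly like the sketch below. RecordFormatting and its three methods are hypothetical names, not part of the code above, and the yyyyMMdd format is an assumption taken from the DTM lines in the sample file (the REF values shown there would not match it).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper class for illustration; not part of the original answer.
public static class RecordFormatting
{
    // Joins the collected LQ values into one comma-separated string.
    public static string JoinErrors(IEnumerable<string> errors)
    {
        return string.Join(",", errors);
    }

    // Tries to read a yyyyMMdd value such as "20120512" (assumed format).
    // Returns null when the text does not match that format exactly.
    public static DateTime? ParseDate(string value)
    {
        DateTime parsed;
        bool ok = DateTime.TryParseExact(
            value, "yyyyMMdd", CultureInfo.InvariantCulture,
            DateTimeStyles.None, out parsed);
        return ok ? parsed : (DateTime?)null;
    }

    // Tries to read a digits-only item number; returns null on anything else.
    public static int? ParseItemNumber(string value)
    {
        int parsed;
        bool ok = int.TryParse(
            value, NumberStyles.None, CultureInfo.InvariantCulture, out parsed);
        return ok ? parsed : (int?)null;
    }
}
```

    For the third sample record, RecordFormatting.JoinErrors(record.Errors) would give "19,84". Returning null on a failed parse is only one option; throwing or logging may suit your data better.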

    EDIT: Following up on your comment, more complicated or specialized parsing can live in the case itself or be moved into another method. Instead of the LAST_NAME case I have above, you could have:

    case(Tokens.LAST_NAME) :
        ParseName(currentRecord, value);
        break;
    

    Where ParseName is:

    public static void ParseName(LXRecord record, string value)
    {
        int tokenIndex = value.IndexOf(Tokens.TOKEN_SPLITTER);
        if (tokenIndex < 1 || tokenIndex == value.Length - 1) //no last name and first name?
        {
            record.LastName = value;
        }
        else
        {
            record.LastName = value.Substring(0, tokenIndex);
            record.FirstName = value.Substring(tokenIndex + 1);
        }
    }
    

    The token check may need tweaking for your data (for example, a trailing asterisk currently ends up in LastName), but it should give you a good idea.
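
    One possible tweak, as a rough sketch: split on the first asterisk only, so that missing or empty parts fall out naturally. NameParts and NameParsing are hypothetical names introduced here for illustration, and the "Last*First" layout is an assumption about what follows NM1 in your file.

```csharp
using System;

// Hypothetical holder for the two name parts; not part of the original answer.
public class NameParts
{
    public string LastName { get; set; }
    public string FirstName { get; set; }
}

// Hypothetical helper class for illustration.
public static class NameParsing
{
    // Splits "Last*First" on the first asterisk only, so any later
    // asterisks stay inside FirstName. A missing first name becomes "".
    public static NameParts Split(string value)
    {
        string[] parts = value.Split(new[] { '*' }, 2);
        return new NameParts
        {
            LastName = parts[0],
            FirstName = parts.Length > 1 ? parts[1] : string.Empty
        };
    }
}
```

    With this version, "Smith*" yields LastName "Smith" and an empty FirstName, and "Doe*Jane*Q" yields LastName "Doe" and FirstName "Jane*Q". ParseName could copy the two values onto the record.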