c# wpf streamreader end-of-line

How can I tell if there is an Environment.NewLine at the end of a line returned by StreamReader.ReadLine()?


I am trying to read a text file line by line and create one line from multiple lines until the line read in has \r\n at the end. My data looks like this:

BusID|Comment1|Text\r\n
1010|"Cuautla, Inc. d/b/a 3 Margaritas VIII\n
State Lic. #40428210000   City Lic.#4042821P\n
9/26/14      9/14/14 - 9/13/15    $175.00\n
9/20/00    9/14/00 - 9/13/01    $575.00 New License"\r\n
1020|"7-Eleven Inc., dba 7-Eleven Store #20638\n
State Lic. #24111110126; City Lic. #2411111126P\n
SEND ISSUED LICENSES TO DALLAS, TX\r\n

I want the data to look like this:

BusID|Comment1|Text\r\n
1010|"Cuautla, Inc. d/b/a 3 Margaritas VIII State Lic. #40428210000   City Lic.#4042821P 9/26/14      9/14/14 - 9/13/15    $175.00 9/20/00    9/14/00 - 9/13/01    $575.00 New License"\r\n
1020|"7-Eleven Inc., dba 7-Eleven Store #20638 State Lic. #24111110126; City Lic. #2411111126P SEND ISSUED LICENSES TO DALLAS, TX\r\n

My code is like this:

FileStream fsFileStream = new FileStream(strInputFileName, FileMode.Open,
    FileAccess.Read, FileShare.ReadWrite);

using (StreamReader srStreamRdr = new StreamReader(fsFileStream))
{
    while ((strDataLine = srStreamRdr.ReadLine()) != null && !blnEndOfFile)
    {
        //code evaluation here
    }
}

I have tried:

if (strDataLine.EndsWith(Environment.NewLine))
{
    blnEndOfLine = true;
}

and

if (strDataLine.Contains(Environment.NewLine))
{
    blnEndOfLine = true;
}

These do not see anything at the end of the string variable. Is there a way for me to tell the true end of line so I can combine these rows into one row? Should I be reading the file differently?


Solution

  • You cannot use the ReadLine method of the StreamReader for this, because it treats every kind of newline (\r\n, \n, or a lone \r) as a line terminator and strips it from the string it returns. Once the reader has returned a line, you can no longer tell whether the characters it removed were \r\n or just \n.
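
To see this behaviour in isolation, here is a minimal sketch. The class and method names (ReadLineDemo, ReadAllLines) are made up for illustration and are not part of the question's code; the sample text is read through a StreamReader over an in-memory stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class ReadLineDemo
{
    // Hypothetical helper: returns every string that ReadLine yields for the given text.
    public static List<string> ReadAllLines(string text)
    {
        var result = new List<string>();
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(text)))
        using (var reader = new StreamReader(stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // The terminator has already been stripped at this point,
                // so \n and \r\n are indistinguishable here.
                result.Add(line);
            }
        }
        return result;
    }
}
```

Calling ReadLineDemo.ReadAllLines("first\nsecond\r\nthird") should give three strings, none of which contains '\r' or '\n', which is why the EndsWith and Contains checks in the question never match.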

    If the file is not really big, you can load everything into memory and split it into lines yourself:

    // Namespaces needed by this snippet
    using System;
    using System.IO;
    using System.Linq;
    
    // Load everything in memory
    string fileData = File.ReadAllText(@"D:\temp\myData.txt");
    
    // Split on \r\n (I don't use Environment.NewLine because it
    // follows the OS conventions, which could be wrong in this context)
    string[] lines = fileData.Split(new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
    
    // Now replace the remaining \n with a space 
    lines = lines.Select(x => x.Replace("\n", " ")).ToArray();
    
    foreach(string s in lines)
       Console.WriteLine(s);
    

    EDIT
    If your file is really big (you mention 3.5GB), you cannot load everything into memory and need to process it in blocks instead. Fortunately, StreamReader provides a method called ReadBlock that lets us implement code like this:

    // Where we store the lines loaded from file
    List<string> lines = new List<string>();
    
    // Read a block of 10MB
    char[] buffer = new char[1024 * 1024 * 10];
    bool lastBlock = false;
    string leftOver = string.Empty;
    
    // Start the streamreader
    using (StreamReader reader = new StreamReader(@"D:\temp\localtext.txt"))
    {
        // We exit when the last block is reached
        while (!lastBlock)
        {
            // Read 10MB
            int loaded = reader.ReadBlock(buffer, 0, buffer.Length);
    
            // Exit if we have no more blocks to read (EOF)
            if(loaded == 0) break;
    
            // If we get fewer characters than the block size then
            // we are on the last block
            lastBlock = (loaded != buffer.Length);
    
            // Create the string from the buffer
            string temp = new string(buffer, 0, loaded);
    
            // prepare the working string adding the remainder from the 
            // previous loop
            string current = leftOver + temp;
    
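            // Search for the last \r\n (ordinal comparison, so the result
            // does not depend on the current culture)
            int lastNewLinePos = temp.LastIndexOf("\r\n", StringComparison.Ordinal);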
    
            if (lastNewLinePos > -1)
            {
                 // Prepare the working string
                 current = leftOver + temp.Substring(0, lastNewLinePos + 2);
    
                 // Save the incomplete parts for the next loop
                 leftOver = temp.Substring(lastNewLinePos + 2);
            }
            // Process the lines
            AddLines(current, lines);
        }
    }
    
    void AddLines(string current, List<string> lines)
    {
        var splitted = current.Split(new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
        lines.AddRange(splitted.Select(x => x.Replace("\n", " ")).ToList());
    }
    

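    This code assumes that your file always ends with \r\n and that every 10MB block of text contains at least one \r\n. If the file does not end with \r\n, the text after the final \r\n is never added to the list; if a block contains no \r\n, an incomplete record is added too early. Test it against your actual data.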
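
If those assumptions are a concern, one possible alternative is to walk the input one character at a time and write the result straight to an output, so nothing larger than a single character has to be held back. The following is only a sketch under that idea, not the method used above: RecordJoiner and JoinRecords are hypothetical names, and unlike the Split-based code it keeps empty records and writes the \r\n terminators to the output.

```csharp
using System;
using System.IO;

public static class RecordJoiner
{
    // Copies text from input to output, keeping every \r\n as a record
    // terminator and replacing every lone \n with a single space.
    public static void JoinRecords(TextReader input, TextWriter output)
    {
        // True when the previous character was '\r' and has not been written yet
        bool pendingCr = false;
        int next;

        while ((next = input.Read()) != -1)
        {
            char c = (char)next;

            if (pendingCr)
            {
                pendingCr = false;
                if (c == '\n')
                {
                    // \r followed by \n: the real end of a record
                    output.Write("\r\n");
                    continue;
                }
                // A lone \r is passed through unchanged
                output.Write('\r');
            }

            if (c == '\r')
                pendingCr = true;      // decide once the next character is known
            else if (c == '\n')
                output.Write(' ');     // lone \n: join with the following line
            else
                output.Write(c);
        }

        // Input ended right after a \r
        if (pendingCr)
            output.Write('\r');
    }
}
```

For a file on disk, this could be called with a StreamReader opened on the input path and a StreamWriter opened on a separate output path (both paths are yours to choose). StreamReader buffers internally, so reading one character at a time does not mean one disk access per character.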