Tags: c#, c#-4.0, console, system.io.file

Compare text files in C# and remove duplicate lines


1.txt:

Origination,destination,datetime,price

YYZ,YTC,2016-04-01 12:30,$550
YYZ,YTC,2016-04-01 12:30,$550
LKC,LKP,2016-04-01 12:30,$550

2.txt:

Origination|destination|datetime|price

YYZ|YTC|2016-04-01 12:30|$550
AMV|YRk|2016-06-01 12:30|$630
LKC|LKP|2016-12-01 12:30|$990

I have two text files, one using ',' and the other using '|' as the field separator. I want to write a C# console app that reads both files and searches them for an origination and destination that I pass on the command line.

While searching, I want to ignore duplicate lines and display the results ordered by price.

The output should be { origination } -> { destination } -> datetime -> price

How can I do this?


Solution

  • Here's a simple solution that works for your example files. It takes the two file names as command-line arguments and does only minimal checking for badly formatted files.

    using System;
    using System.Collections.Generic;
    
    class Program
    {
        class entry
        {
            public string origin;
            public string destination;
            public DateTime time;
            public double price;
        }
    
        static void Main(string[] args)
        {
            List<entry> data = new List<entry>();
    
            //parse the input files and add the data to a list
            ParseFile(data, args[0], ',');
            ParseFile(data, args[1], '|');
    
            //sort the list (by price first)
            data.Sort((a, b) =>
            {
                //compare by price, then origin, destination and time,
                //so that identical entries end up next to each other
                int c = a.price.CompareTo(b.price);
                if (c == 0) c = string.CompareOrdinal(a.origin, b.origin);
                if (c == 0) c = string.CompareOrdinal(a.destination, b.destination);
                if (c == 0) c = DateTime.Compare(a.time, b.time);
                return c;
            });
    
            //remove duplicates (list must be sorted for this to work)
            int i = 1;
            while (i < data.Count)
            {
                if (data[i].origin == data[i - 1].origin
                    && data[i].destination == data[i - 1].destination
                    && data[i].time == data[i - 1].time
                    && data[i].price == data[i - 1].price)
                    data.RemoveAt(i);
                else
                    i++;
            }
    
            //print the results
            for (i = 0; i < data.Count; i++)
                Console.WriteLine("{0}->{1}->{2:yyyy-MM-dd HH:mm}->${3}",
                    data[i].origin, data[i].destination, data[i].time, data[i].price);
    
            Console.ReadLine();
        }
    
        private static void ParseFile(List<entry> data, string filename, char separator)
        {
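            //StreamReader opens the file for reading only, so read-only files work too
            using (System.IO.StreamReader reader = new System.IO.StreamReader(filename))
                while (!reader.EndOfStream)
                {
                    string[] line = reader.ReadLine().Split(separator);
                    DateTime time;
                    double price;
                    //skip blank lines, a header row, and any other line that doesn't parse
                    if (line.Length == 4
                        && DateTime.TryParse(line[2], System.Globalization.CultureInfo.InvariantCulture,
                            System.Globalization.DateTimeStyles.None, out time)
                        && double.TryParse(line[3].Substring(line[3].IndexOf('$') + 1),
                            System.Globalization.NumberStyles.Number,
                            System.Globalization.CultureInfo.InvariantCulture, out price))
                    {
                        entry newitem = new entry();
                        newitem.origin = line[0];
                        newitem.destination = line[1];
                        newitem.time = time;
                        newitem.price = price;
                        data.Add(newitem);
                    }
                }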
        }
    }
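
The question also asks to search by an origination and destination passed on the command line, which the program above does not do. A minimal LINQ sketch of that step might look like the following. The names `Flight`, `FlightSearch`, `ParseLine`, `Search` and `Format` are hypothetical, and so is the argument order (two file names, then origin, then destination). It also assumes route codes should match case-insensitively and that the date always has the form `yyyy-MM-dd HH:mm`, and it swaps `double` for `decimal` so that prices compare exactly.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

class Flight
{
    public string Origin;
    public string Destination;
    public DateTime Time;
    public decimal Price;
}

static class FlightSearch
{
    // Parses one line; returns null for blank lines, header rows and malformed lines.
    public static Flight ParseLine(string line, char separator)
    {
        string[] parts = line.Split(separator);
        DateTime time;
        decimal price;
        if (parts.Length != 4
            || !DateTime.TryParseExact(parts[2].Trim(), "yyyy-MM-dd HH:mm",
                   CultureInfo.InvariantCulture, DateTimeStyles.None, out time)
            || !decimal.TryParse(parts[3].Trim().TrimStart('$'), NumberStyles.Number,
                   CultureInfo.InvariantCulture, out price))
            return null;

        return new Flight
        {
            Origin = parts[0].Trim(),
            Destination = parts[1].Trim(),
            Time = time,
            Price = price
        };
    }

    // Keeps the requested route, drops exact duplicates, orders by price and then time.
    public static List<Flight> Search(IEnumerable<Flight> flights, string origin, string destination)
    {
        return flights
            .Where(f => f != null
                && string.Equals(f.Origin, origin, StringComparison.OrdinalIgnoreCase)
                && string.Equals(f.Destination, destination, StringComparison.OrdinalIgnoreCase))
            // The route is already fixed by the filter, so time and price identify a duplicate.
            .GroupBy(f => new { f.Time, f.Price })
            .Select(g => g.First())
            .OrderBy(f => f.Price)
            .ThenBy(f => f.Time)
            .ToList();
    }

    public static string Format(Flight f)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0} -> {1} -> {2:yyyy-MM-dd HH:mm} -> ${3}",
            f.Origin, f.Destination, f.Time, f.Price);
    }
}

class Program
{
    // Assumed usage: app.exe 1.txt 2.txt YYZ YTC
    static void Main(string[] args)
    {
        if (args.Length != 4)
        {
            Console.Error.WriteLine("usage: <comma file> <pipe file> <origin> <destination>");
            return;
        }

        IEnumerable<Flight> flights =
            File.ReadLines(args[0]).Select(l => FlightSearch.ParseLine(l, ','))
                .Concat(File.ReadLines(args[1]).Select(l => FlightSearch.ParseLine(l, '|')));

        foreach (Flight f in FlightSearch.Search(flights, args[2], args[3]))
            Console.WriteLine(FlightSearch.Format(f));
    }
}
```

Grouping on an anonymous type works here because anonymous types compare by value, so no sort is needed before removing duplicates. With the sample files from the question, searching for YYZ and YTC should leave a single entry, since the same line appears twice in 1.txt and once in 2.txt.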