I have searched but have not found my answer. Disclaimer: I am brand new to C#, but I have a task at work to create the following program: read from existing log files, parse them by tab, limit the results to a specific status (Process E-mail), group by division (e.g. Investment Bank), then calculate statistics for the number of e-mail conversions by division, and print to a new log file.
I wanted to give a bit of background on the program itself before asking the question. I am currently at the point where I would like to group by division, and I can't figure out how to do it.
EDIT: original data:
Status          Division         Time      Run Time    Zip Files  Conversions  Returned Files  Total E-Mails
Process E-mail  Investment Bank  12:00 AM  42.8596599  1          0            1               1
End Processing                   12:05 AM  44.0945784  0          0            0               0
Process E-mail  Investment Bank  12:10 AM  42.7193253  2          1            0               1
Process E-mail  Treasury         12:15 AM  4.6563394   1          0            2               2
Here is the code that I have up to this point:
static void Main()
{
    List<string> list = new List<string>();
    using (StreamReader reader = new StreamReader(Settings.LogPath + "2012-3-10.log"))
    {
        string line;
        int i = 0;
        while ((line = reader.ReadLine()) != null)
        {
            list.Add(line);
            i++;
            string[] split = line.Split('\t');
            string processing = split[0];
            if (processing.StartsWith("Process"))
            {
                string division = split[1];
                // Columns 2 and 3 are Time and Run Time; the counts start at index 4.
                int zipFiles;
                int.TryParse(split[4], out zipFiles);
                int conversions;
                int.TryParse(split[5], out conversions);
                int returnedFiles;
                int.TryParse(split[6], out returnedFiles);
                int totalEmails;
                int.TryParse(split[7], out totalEmails);
                // Print the division followed by its four counts.
                Console.WriteLine(division);
                Console.WriteLine(zipFiles);
                Console.WriteLine(conversions);
                Console.WriteLine(returnedFiles);
                Console.WriteLine(totalEmails);
            }
        }
    }
}
So I have the program to the point where it will spit out something to the console like this:
Investment Bank
1
0
1
1
Treasury
1
0
2
2
Investment Bank
2
1
0
1
What I am looking to do now is group by division ("Investment Bank", "Treasury", etc.) and then calculate the totals for each one.
The final log file will look like this:
Division         Zip Files  Conversions  Returned Files  Total E-mails
Investment Bank  3          1            1               2
Treasury         1          0            2               2
The following code does what you need:
// Requires: using System; using System.IO; using System.Linq;
string filename = @"D:\myfile.log";
var statistics = File.ReadLines(filename)
    .Where(line => line.StartsWith("Process"))
    .Select(line => line.Split('\t'))
    .GroupBy(items => items[1])   // column 1 = Division
    .Select(g =>
        new
        {
            Division = g.Key,
            // Columns 2 and 3 hold Time and Run Time, so the counts start at index 4.
            ZipFiles = g.Sum(i => Int32.Parse(i[4])),
            Conversions = g.Sum(i => Int32.Parse(i[5])),
            ReturnedFiles = g.Sum(i => Int32.Parse(i[6])),
            TotalEmails = g.Sum(i => Int32.Parse(i[7]))
        });
Console.Out.WriteLine("Division\tZip Files\tConversions\tReturned Files\tTotal E-mails");
statistics
    .ToList()
    .ForEach(d => Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
        d.Division,
        d.ZipFiles,
        d.Conversions,
        d.ReturnedFiles,
        d.TotalEmails));
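The question also asks for the totals to go to a new log file rather than the console. Here is a minimal sketch of that step, assuming the column layout from the sample data (Division at index 1, the four counts at indexes 4 to 7). The class name `DivisionReport`, the `Build` helper, and the command-line paths are illustrative only and are not part of the original program:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class DivisionReport
{
    // Hypothetical helper: turns raw log lines into report lines, header first.
    public static List<string> Build(IEnumerable<string> lines)
    {
        var report = new List<string>
        {
            "Division\tZip Files\tConversions\tReturned Files\tTotal E-mails"
        };

        report.AddRange(lines
            .Where(line => line.StartsWith("Process"))
            .Select(line => line.Split('\t'))
            .GroupBy(items => items[1])
            .Select(g => string.Format("{0}\t{1}\t{2}\t{3}\t{4}",
                g.Key,
                g.Sum(i => int.Parse(i[4])),     // Zip Files
                g.Sum(i => int.Parse(i[5])),     // Conversions
                g.Sum(i => int.Parse(i[6])),     // Returned Files
                g.Sum(i => int.Parse(i[7])))));  // Total E-Mails

        return report;
    }

    static void Main(string[] args)
    {
        // Assumed usage: first argument is the source log, second is the file to create.
        File.WriteAllLines(args[1], Build(File.ReadLines(args[0])));
    }
}
```

Keeping the grouping in a method that takes lines rather than a path makes it easy to check against the sample rows without touching the disk. Note that `int.Parse` throws on a malformed count, whereas the `int.TryParse` calls in the question silently treat it as 0; which behavior is preferable depends on how much you trust the logs.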
The query could be even shorter (though less clear) if it worked with arrays instead of anonymous types. Let me know if you are interested in such code.
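For reference, one possible reading of the array-based variant is sketched below. This is a guess at what such code might look like, not the answerer's own version. It sums the four count columns by index instead of naming them, again assuming the counts sit at indexes 4 to 7 as in the sample data; `ArrayTotals` and `Summarize` are made-up names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ArrayTotals
{
    // Returns one tab-separated line per division: the name, then the four column totals.
    public static List<string> Summarize(IEnumerable<string> lines)
    {
        return lines
            .Where(line => line.StartsWith("Process"))
            .Select(line => line.Split('\t'))
            .GroupBy(items => items[1])
            .Select(g => g.Key + "\t" + string.Join("\t",
                // Enumerable.Range(4, 4) yields the column indexes 4, 5, 6 and 7.
                Enumerable.Range(4, 4).Select(col => g.Sum(items => int.Parse(items[col])))))
            .ToList();
    }

    static void Main()
    {
        // Sample rows copied from the question, with tabs restored.
        var sample = new[]
        {
            "Process E-mail\tInvestment Bank\t12:00 AM\t42.8596599\t1\t0\t1\t1",
            "End Processing\t\t12:05 AM\t44.0945784\t0\t0\t0\t0",
            "Process E-mail\tInvestment Bank\t12:10 AM\t42.7193253\t2\t1\t0\t1",
            "Process E-mail\tTreasury\t12:15 AM\t4.6563394\t1\t0\t2\t2"
        };

        foreach (var row in Summarize(sample))
            Console.WriteLine(row);
        // Prints (tab-separated):
        // Investment Bank  3  1  1  2
        // Treasury         1  0  2  2
    }
}
```

The trade-off is the one mentioned above: the query is shorter, but the meaning of each total now depends on remembering the column order rather than on a property name.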