Search code examples
c#csvexport-to-csvpowerpivotweblog

Converting log file to CSV


I have to convert a (Squid Web Proxy Server) log file to CSV file, so that it can be loaded into powerpivot for analysis of queries. So how should I start, any help would strongly be appreciated. I've to use C# language for this task, log looks like the following:

Format: Timestamp Elapsed Client Action/Code Size Method URI Ident Hierarchy/From Content

1473546438.145    917 5.45.107.68 TCP_DENIED/403 4114 GET http://atlantis.pennergame.de/pet/ - NONE/- text/html
1473546439.111      3 146.148.96.13 TCP_DENIED/403 4604 POST http://mobiuas.ebay.com/services/mobile/v1/UserAuthenticationService - NONE/- text/html
1473546439.865    358 212.83.168.7 TCP_DENIED/403 3955 GET http://www.theshadehouse.com/left-sidebar-post/ - NONE/- text/html
1473546439.985    218 185.5.97.68 TCP_DENIED/403 3600 GET http://www.google.pl/search? - NONE/- text/html
1473546440.341      2 146.148.96.13 TCP_DENIED/403 4604 POST http://mobiuas.ebay.com/services/mobile/v1/UserAuthenticationService - NONE/- text/html
1473546440.840    403 115.29.46.240 TCP_DENIED/403 4430 POST http://et.airchina.com.cn/fhx/consumeRecord/getCardConsumeRecordList.htm - NONE/- text/html
1473546441.486      2 52.41.27.39 TCP_DENIED/403 3813 POST http://www.deezer.com/ajax/action.php - NONE/- text/html
1473546441.596      2 146.148.96.13 TCP_DENIED/403 4604 POST http://mobiuas.ebay.com/services/mobile/v1/UserAuthenticationService - NONE/- text/html

Solution

  • It is already close to a CSV, so read it line by line and clean each line up a little:

    ...
    line = line
      .Replace("   ", " ")  // compress 3 spaces to 1
      .Replace("  ", " ")   // compress 2 spaces to 1
      .Replace("  ", " ")   // compress 2 spaces to 1, again
      .Replace(" ", "|")    // replace space by '|'
      .Replace(" - ", "|"); // replace  -  by '|'
    

    You may want to tweak this for the fields like TCP_DENIED/403 .

    this gives you a '|' separated line. Easy to convert to any separator you need. Or split it up:

    // write it out or process it further    
    string[] parts = line.split('|');