Filter logfile to create a csv report using PowerShell


I have NetApp log output in a log file, in the format below.

DeviceDetails.log file content

  /vol/DBCXARCHIVE002_E_Q22014_journal/DBCXARCHIVE002_E_Q22014_journal    1.0t (1149038714880) (r/w, online, mapped)
    Comment: " "
    Serial#: e3eOF4y4SRrc
    Share: none
    Space Reservation: enabled (not honored by containing Aggregate)
    Multiprotocol Type: windows_2008
    Maps: DBCXARCHIVE003=33
    Occupied Size: 1004.0g (1077986099200)
    Creation Time: Wed Apr 30 20:14:51 IST 2014
    Cluster Shared Volume Information: 0x0 
    Read-Only: disabled
/vol/DBCXARCHIVE002_E_Q32014_journal/DBCXARCHIVE002_E_Q32014_journal  900.1g (966429273600)  (r/w, online, mapped)
    Comment: " "
    Serial#: e3eOF507DSuU
    Share: none
    Space Reservation: enabled (not honored by containing Aggregate)
    Multiprotocol Type: windows_2008
    Maps: DBCXARCHIVE003=34
    Occupied Size:  716.7g (769556951040) 
    Creation Time: Tue Aug 12 20:24:14 IST 2014
    Cluster Shared Volume Information: 0x0 
    Read-Only: disabled 

The sample above shows only 2 devices; the actual log file contains many more, appended in the same format.

I need just 4 details from each device block. The first line contains 3 of them:

Device Name : /vol/DBCXARCHIVE002_E_Q22014_journal/DBCXARCHIVE002_E_Q22014_journal

Total Capacity : 1.0t (1149038714880)

Status : (r/w, online, mapped)

And the 4th Detail I need is Occupied Size: 1004.0g (1077986099200)

So the CSV output should look like this (screenshot not included): one row per device, with the four details above as columns.

I am just a beginner at coding and am trying to achieve this with the code below, but it does not help much:

$logfile = Get-Content .\DeviceDetails.log
$l1 = $logfile | select-string "/vol"
$l2 = $logfile | select-string "Occupied Size: " 

$objs =@()
$l1 | ForEach {
$o = $_ 
$l2 | ForEach {
    $o1 = $_
    $Object22 = New-Object PSObject -Property @{
        'LUN Name , Total Space, Status, Occupied Size'  = "$o"
        'Occupied Size'  = "$o1"           
    }           

}
$objs += $Object22  
}
$objs

Solution

    $obj = $null # variable to store each output object temporarily
    Get-Content .\DeviceDetails.log | ForEach-Object { # loop over input lines
      if ($_ -match '^\s*(/vol.+?)\s+(.+? \(.+?\))\s+(\(.+?\))') {
        # Create a custom object with all properties of interest,
        # and store it in the $obj variable created above.
        # What the regex's capture groups - (...) - captured is available in
        # the automatic $Matches variable via indices starting at 1.
        $obj = [pscustomobject] @{
          'Device Name' = $Matches[1]
          'Total Space' = $Matches[2]
          'Status' = $Matches[3]
          'Occupied Size' = $null # filled below
        }
      } elseif ($_ -match '\bOccupied Size: (.*)') {
        # Set the 'Occupied Size' property value...
        $obj.'Occupied Size' = $Matches[1]
        # ... and output the complete object.
        $obj
      }
    } | Export-Csv -NoTypeInformation out.csv
    

    - Note that in Windows PowerShell, Export-Csv defaults to ASCII output encoding (PowerShell 6+ defaults to BOM-less UTF-8); change that with the -Encoding parameter.
    - To extract only the numbers inside (...) for the Total Space and Occupied Size columns, use
    $_ -match '^\s*(/vol.+?)\s+.+?\s+\((.+?)\)\s+(\(.+?\))' and
    $_ -match '\bOccupied Size: .+? \((.*)\)' instead.
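
    As a rough sketch of that numbers-only variant, the snippet below applies the two alternative regexes to a few sample lines copied from the question's log. The variable names $sampleLines and $rows are illustrative only; with a real file the lines would come from Get-Content, and ConvertTo-Csv would be replaced by Export-Csv.

```powershell
# Sample lines copied from the question's log (illustrative; use Get-Content for a real file).
$sampleLines = @(
  '  /vol/DBCXARCHIVE002_E_Q22014_journal/DBCXARCHIVE002_E_Q22014_journal    1.0t (1149038714880) (r/w, online, mapped)'
  '    Serial#: e3eOF4y4SRrc'
  '    Occupied Size: 1004.0g (1077986099200)'
  '/vol/DBCXARCHIVE002_E_Q32014_journal/DBCXARCHIVE002_E_Q32014_journal  900.1g (966429273600)  (r/w, online, mapped)'
  '    Occupied Size:  716.7g (769556951040) '
)

$obj = $null
$rows = @($sampleLines | ForEach-Object {
  if ($_ -match '^\s*(/vol.+?)\s+.+?\s+\((.+?)\)\s+(\(.+?\))') {
    # Header line: capture the name, the byte count inside (...), and the status.
    $obj = [pscustomobject] @{
      'Device Name'   = $Matches[1]
      'Total Space'   = $Matches[2]
      'Status'        = $Matches[3]
      'Occupied Size' = $null # filled below
    }
  } elseif ($_ -match '\bOccupied Size: .+? \((.*)\)') {
    # Capture only the byte count inside (...), then output the complete object.
    $obj.'Occupied Size' = $Matches[1]
    $obj
  }
})

# Show the result as CSV text instead of writing a file.
$rows | ConvertTo-Csv -NoTypeInformation
```

    The size columns then hold plain byte counts, which are easier to sort or sum than values such as 1.0t or 716.7g.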

    Note how this solution processes the input file line by line, which keeps memory use down, though generally at the expense of performance.


    As for what you tried:

    • You collect the entire input file as an array in memory ($logfile = Get-Content .\DeviceDetails.log)

    • You then filter this array twice into parallel arrays, containing corresponding lines of interest.

    • Things go wrong when you attempt to nest the processing of these 2 arrays. Instead of nesting, you must enumerate them in parallel, as their corresponding indices contain matching entries.

    • Additionally:

      • A line such as 'LUN Name , Total Space, Status, Occupied Size' = "$o" creates a single property named LUN Name , Total Space, Status, Occupied Size, which is not the intent.
      • To create distinct properties (which become distinct columns in the CSV output), you must define each one separately, which requires parsing the input line into separate values.
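
    To illustrate the parallel enumeration described above, here is a minimal sketch that keeps the question's two-array approach but walks both arrays with a single index. It assumes every device block contains exactly one "Occupied Size" line, so that $l1[$i] and $l2[$i] belong to the same device; the sample lines are copied from the question and stand in for Get-Content .\DeviceDetails.log.

```powershell
# Sample lines copied from the question's log (illustrative; use Get-Content for a real file).
$logfile = @(
  '  /vol/DBCXARCHIVE002_E_Q22014_journal/DBCXARCHIVE002_E_Q22014_journal    1.0t (1149038714880) (r/w, online, mapped)'
  '    Maps: DBCXARCHIVE003=33'
  '    Occupied Size: 1004.0g (1077986099200)'
  '/vol/DBCXARCHIVE002_E_Q32014_journal/DBCXARCHIVE002_E_Q32014_journal  900.1g (966429273600)  (r/w, online, mapped)'
  '    Maps: DBCXARCHIVE003=34'
  '    Occupied Size:  716.7g (769556951040) '
)

# Two parallel arrays: entry $i in each one belongs to the same device
# (assumption: exactly one "Occupied Size" line per device block).
$l1 = @($logfile | Select-String '/vol')
$l2 = @($logfile | Select-String 'Occupied Size: ')

$objs = @(for ($i = 0; $i -lt $l1.Count; $i++) {
  # .Line holds the text of the matched line; split it into distinct values.
  if ($l1[$i].Line -match '^\s*(/vol.+?)\s+(.+? \(.+?\))\s+(\(.+?\))') {
    [pscustomobject] @{
      'Device Name'   = $Matches[1]
      'Total Space'   = $Matches[2]
      'Status'        = $Matches[3]
      # Strip the label and surrounding whitespace from the parallel entry.
      'Occupied Size' = ($l2[$i].Line -replace '^.*Occupied Size:\s*').Trim()
    }
  }
})

$objs | ConvertTo-Csv -NoTypeInformation
```

    If a device block could lack an "Occupied Size" line, the two arrays would fall out of step, which is one reason the single-pass solution above is the more robust choice.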