Regex search and generate output in csv


I am working on the requirement below. I have the following input file and am trying to extract specific information from it using regex, then create an output CSV file with all the matches.

Sample file

[2022-05-31T16:56:25.551558-04:00] [XFM] [TRACE:1] [EPMHFM-00000] [XFM] [ecid: XDS.0000.0000.0000.0001] [File: c:\jenkins\workspace\hfm_11.2.8_rue_build\hfm\source\xfmdatasourceroot\xfmdatasource\xfmdatasource.cpp] [Line: 601] [userId: ] [Msg arguments: ] [appName: COMMAPS4] [pid: 8364] [tid: 14160] [host: win2019standard] [nwaddr: [fe80::3d7e:f4de:2f83:8e30%4]:0;10.65.51.209:0;] [errorCode: 0] [srcException: 0] [errType: 1] [dbUpdate: 2] [11.2.8.1.000.9]  [[XDS: XFMDataSource process starting...
 
[2022-08-28T20:04:39.037507-04:00] [XFM] [TRACE:1] [EPMHFM-00000] [XFM] [ecid: ] [File: c:\jenkins\workspace\hfm_11.2.6_build\hfm\source\xfmdatasourceroot\xfmdatasource\xfmdatasource.cpp] [Line: 489] [userId: ] [Msg arguments: ] [appName: COMMAPS4] [pid: 11496] [tid: 9268] [host: Win2019standard] [nwaddr: [fe80::dca:9652:390a:7031%11]:0;10.199.36.35:0;] [errorCode: 0] [srcException: 0] [errType: 1] [dbUpdate: 2] [11.2.6.0.000.38]  [[XDS: XFMDataSource process exiting now ...

Below is the PowerShell command with the regex I used to extract the desired values.

Get-Content -Path "C:\Users\test.txt"  |select-String -pattern "(\d{4}\-\d{2}\-\d{2})|`(\d\d:\d\d:\d\d|host+:\s([a-zA-Z \d]+))|appName+:\s([a-zA-Z \d]+)|`(11.\d.\d.\d.\d\d\d.\d\d)|pid+:\s([a-zA-Z \d]+)|XDS+:\s([a-zA-Z \d]+)"-AllMatches|ForEach-Object {$_.Matches.value} |ForEach-Object {$_.Groups[1].Value}

Sample output:

2022-05-31

16:56:25

appName: COMMAPS4

pid: 8364

host: win2019standard

XDS: XFMDataSource process starting

========

The goal is to create a CSV file in the format below containing all the results.

Date, Time, Appname, PID, Host, Message
2022-05-31, 16:56:25, COMMAPS4, 8364, win2019standard, XFMDataSource starting

I tried Out-File and Export-Csv, but neither gives the desired output:

$content =Get-Content -Path "C:\Users\test.txt"
$regex = '(\d{4}\-\d{2}\-\d{2})|(\d\d:\d\d:\d\d|host+:\s([a-zA-Z \d]+))|appName+:\s([a-zA-Z \d]+)|(11.\d.\d.\d.\d\d\d.\d\d)|pid+:\s([a-zA-Z \d]+)|XDS+:\s([a-zA-Z \d]+)'
[regex]::Matches($content, $regex) | % {  
    [PSCustomObject]@{
            Date = $_.Groups.Value[1]
            Time = $_.Groups.Value[2]
             Server= $_.Groups.Value[3]
             Application= $_.Groups.Value[4]
             Version= $_.Groups.Value[5]
             PID= $_.Groups.Value[6]
             Message= $_.Groups.Value[7]
            }
 }  | Export-Csv -Path C:\Users\test.csv

Solution

  • It looks like you could accomplish this with a single regex that uses named capture groups (see "Code" below). Using the sample in the question, the resulting objects would look like this:

    Date       Time     AppName  PID   Host            Message
    ----       ----     -------  ---   ----            -------
    2022-05-31 16:56:25 COMMAPS4 8364  win2019standard XFMDataSource process starting
    2022-08-28 20:04:39 COMMAPS4 11496 Win2019standard XFMDataSource process exiting now
    

    Code:

    $re = [regex] @'
    (?xi)
      (?<Date>\d{4}(?:-\d{2}){2}).*?
      (?<Time>\d{2}(?::\d{2}){2}).*?appName[\s:]*
      (?<AppName>[\w ]+).*?pid[\s:]*
      (?<PID>\d+).*?host[\s:]*
      (?<Host>[\w ]+).*?XDS[\s:]*
      (?<Message>[\w ]+)
    '@
    
    $log = Get-Content path\to\log.txt -Raw
    
    $re.Matches($log) | ForEach-Object {
        $out = [ordered]@{}
        foreach($group in $_.Groups) {
            if($group.Name -eq 0) { continue } # skip group 0, which holds the entire match
            $out[$group.Name] = $group.Value
        }
        [pscustomobject] $out
    } | Export-Csv path\to\export.csv -NoTypeInformation
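
To check the pattern before pointing it at a real log, the same named-group idea can be tried against a single line with the `-match` operator. This is only a minimal sketch: `$line` is an abbreviated copy of the first sample entry (the fields between the captured ones are dropped), `$pattern` is the regex above collapsed onto one line, and `ConvertTo-Csv` stands in for `Export-Csv` so that nothing is written to disk. The `.Trim()` on the message is an addition, since `[\w ]+` can pick up a trailing space.

```powershell
# Abbreviated copy of the first sample log entry (hypothetical shortening for illustration).
$line = '[2022-05-31T16:56:25.551558-04:00] [XFM] [appName: COMMAPS4] [pid: 8364] [tid: 14160] [host: win2019standard] [errorCode: 0] [[XDS: XFMDataSource process starting...'

# Same named groups as the answer's regex, written on one line.
# -match is case-insensitive by default, so the (?i) flag is not needed here.
$pattern = '(?<Date>\d{4}(?:-\d{2}){2}).*?(?<Time>\d{2}(?::\d{2}){2}).*?appName[\s:]*(?<AppName>[\w ]+).*?pid[\s:]*(?<PID>\d+).*?host[\s:]*(?<Host>[\w ]+).*?XDS[\s:]*(?<Message>[\w ]+)'

# On success, -match fills the automatic $Matches table, keyed by group name.
$row = if ($line -match $pattern) {
    [pscustomobject]@{
        Date    = $Matches['Date']
        Time    = $Matches['Time']
        AppName = $Matches['AppName']
        PID     = $Matches['PID']
        Host    = $Matches['Host']
        Message = $Matches['Message'].Trim()  # drop any trailing space captured by [\w ]+
    }
}

# ConvertTo-Csv returns the CSV text as strings: one header line plus one data line.
$csv = $row | ConvertTo-Csv -NoTypeInformation
$csv
```

If the properties come out as expected, swapping `ConvertTo-Csv` for `Export-Csv` with a path should produce the file the question asks for.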