Search code examples
powershellcsvgetcontent

Powershell getcontent from arranged report


just new to powershell. I need an listed report taken from such below txt report. this is only some lines of example. I have more than 10000 lines such kind of this.

 PIPELINE REF LINENUMBER_A
 BATCH REF XXXA
 PIPING SPEC 01B

                                            N.S.         LENGTH                                     QTY     QTY
          BOLT DESCRIPTION                  (INS)        (MM)         ITEM CODE                     FAB     EREC
          ----------------                  ----         ------       ---------                     ---     ----

  70 STUD BOLTS & 2 HEAVY Hex Nuts          1/2          70           0597461                     0       4
     B18.2.1/B18.2.2 A320-L7M/A194-7M (S3)
     SPUN GLV TO F2329 INCH SIZE

1          31 Mar 2021                                                                                         Page     1

 PIPELINE REF LINENUMBER_B
 BATCH REF XXXB
 PIPING SPEC 01C

                                            N.S.         LENGTH                                     QTY     QTY
          BOLT DESCRIPTION                  (INS)        (MM)         ITEM CODE                     FAB     EREC
          ----------------                  ----         ------       ---------                     ---     ----

  90 STUD BOLTS & 2 HEAVY Hex Nuts          5/8          90           7452665                     0       16
     B18.2.1/B18.2.2 A320-B8M CL2
     (S1/S3)/A194-8MA INCH SIZE
  210 STUD BOLTS & 3 HH Nuts                1.1/8        210          9989993                     0       16
     B18.2.1/B18.2.2 A320-B8M CL2
     (S1/S3)/A194-8MA INCH SIZE-With a 3rd
     nut for tensioning

The result I expected as comma separated is enough as below:

LINENUMBER_A,01B,STUD BOLTS & 2 HEAVY Hex Nuts,1/2,70,0597461,0,4 LINENUMBER_B,01C,STUD BOLTS & 2 HEAVY Hex Nuts,5/8,90,7452665,0,16 LINENUMBER_B,01C,STUD BOLTS & 3 HH Nuts,1.1/8,210,9989993,0,16

I've tried

$p = @("PIPELINE","STUD BOLT")
Get-Content '.\bolt-out.bom' | Select-String -Pattern $p -SimpleMatch | Set-Content test0.txt

but the result is not as expected. anyone can help realy appreciate.

the result is below. I only may need something like concatenate of first array finding + second finding instead of new line number.

PIPELINE REF LINENUMBER_A
90 STUD BOLTS & 2 HEAVY Hex Nuts          5/8          90           7452665                     0       16

Solution

  • I would suggest parsing this into a nice CSV file with column headers:

    # read the file as single multilined string and split on `PIPELINE REF`
    $result = (Get-Content -Path '.\bolt-out.bom' -Raw) -split 'PIPELINE REF' | Where-Object {$_ -match '\S' } | ForEach-Object {
        # split the part in separate lines and trim to get rid of all leading and trailing spaces
        $lines = ($_ -split '\r?\n').Trim()
        $lines[0] = 'PIPELINE REF ' + $lines[0] # re-add 'PIPELINE REF'
        # create an empty item
        $item = '' | Select-Object LineRef,PipingSpec,Description,NS,Length,ItemCode,QTY_Fab,QTY_Erec,Total_QTY
        # parse line by line
        switch -Regex ($lines) {
            '^PIPELINE REF (\w+)' { $item.LineRef = $matches[1] }
            '^PIPING SPEC (\w+)'  { $item.PipingSpec = $matches[1] }
            '^(\d+.*)\s{2,}([^\s]+)\s{2,}([^\s]+)\s{2,}([^\s]+)\s{2,}([^\s]+)\s{2,}([^\s]+)' {
                $item.Description = $matches[1]
                $item.NS          = $matches[2]
                $item.Length      = $matches[3]
                $item.ItemCode    = $matches[4]
                $item.QTY_Fab     = [int]$matches[5]
                $item.QTY_Erec    = [int]$matches[6]
                $item.Total_QTY   = [int]$matches[5] + [int]$matches[6]
            }
        }
        # output the item
        $item
    }
    
    # output on screen
    $result
    
    # output to comma separated CSV file (WITH headers)
    $result | Export-Csv -Path '.\boltsandnuts.csv' -NoTypeInformation
    

    Output on screen should look like:

    LineRef     : LINENUMBER_A
    PipingSpec  : 01B
    Description : 70 STUD BOLTS & 2 HEAVY Hex Nuts        
    NS          : 1/2
    Length      : 70
    ItemCode    : 0597461
    QTY_Fab     : 0
    QTY_Erec    : 4
    Total_QTY   : 4
    

    Regex details of the long regex line:

    ^               Assert position at the beginning of the string
    (               Match the regular expression below and capture its match into backreference number 1
       \d           Match a single digit 0..9
          +         Between one and unlimited times, as many times as possible, giving back as needed (greedy)
       .            Match any single character that is not a line break character
          *         Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    )              
    \s              Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
       {2,}         Between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
    (               Match the regular expression below and capture its match into backreference number 2
       [^\s]        Match any character that is NOT a “A whitespace character (spaces, tabs, line breaks, etc.)”
          +         Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )              
    \s              Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
       {2,}         Between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
    (               Match the regular expression below and capture its match into backreference number 3
       [^\s]        Match any character that is NOT a “A whitespace character (spaces, tabs, line breaks, etc.)”
          +         Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )              
    \s              Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
       {2,}         Between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
    (               Match the regular expression below and capture its match into backreference number 4
       [^\s]        Match any character that is NOT a “A whitespace character (spaces, tabs, line breaks, etc.)”
          +         Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )              
    \s              Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
       {2,}         Between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
    (               Match the regular expression below and capture its match into backreference number 5
       [^\s]        Match any character that is NOT a “A whitespace character (spaces, tabs, line breaks, etc.)”
          +         Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    \s          # Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
       {2,}        # Between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
    (           # Match the regular expression below and capture its match into backreference number 6
       [^\s]       # Match any character that is NOT a “A whitespace character (spaces, tabs, line breaks, etc.)”
          +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )