Search code examples
stringpowershellexport-to-csv

Powershell: Extract several strings from txt and create table out of it


I need to create a csv file out of values that are spread over many txt files. Here is an example for one of the txt files (they are all formatted the same way and stored in one folder, lets say c:\user\txtfiles):

System: asdf
Store: def
processid: 00001
Language: english
prodid: yellowshoes12
email: asdf@asdf.com
prodid: blueshoes34
some
other
text blabla

The result csv should look like this (i added values from another sample txt just to make it clear):

processid, prodid
00001, yellowshoes12
00001, blueshoes34
00002, redtshirt12
00002, greensocks34

That means that every product ID in the txt should be assigned to the one processid in the txt and added as single line to the csv.

I tried to reach the result as follows:

$pathtofiles = Get-ChildItem  c:\user\txtfiles | select -ExpandProperty FullName
$parsetxt = $pathtofiles | 
    ForEach { 
        $orderdata = Import-Csv $_  |
        Where-Object {($_ -like '*processid*') -or ($_ -like '*prodid*')} |
        foreach {
        
        write-output $orderdata -replace 'processid: ','' -replace 'prodid: ',''
        }
    }
      $orderdata

So my intention was to isolate the relevant lines, delete everything that is not wanted, assign the values to variables and build a table out of it. One problem is that if I replace $orderdata from the end of the code into the end of the first foreach-loop nothing is printed. But after deliberating quite a while I am not sure if my approach is a good one anyway. So any help would be very appreciated!

Daniel


Solution

  • I think this is best done using a switch -Regex -File construct while iterating over the files in your folder.

    # get the files in the folder and loop over them
    $result = Get-ChildItem -Path 'c:\user\txtfiles' -Filter '*.txt' -File | ForEach-Object {
        # the switch processes each line of a file and matches the regex to it
        switch -Regex -File $_.FullName {
            '^processid:\s+(\d+)' { $id = $matches[1] }
            '^prodid:\s+(\w+)' { [PsCustomObject]@{'processid' = $id; 'prodid' = $matches[1]}}
        }
    } | Sort-Object processid, prodid
    
    # output on console screen
    $result
    
    # output to CSV file
    $result | Export-Csv -Path 'c:\user\txtfiles\allids.csv'
    

    Result on screen:

    processid prodid       
    --------- ------       
    00001     blueshoes34  
    00001     yellowshoes12
    00002     greenshoes56 
    00002     purpleshoes88