Tags: powershell, csv, split

Adding a progress indicator to a PowerShell script that splits a CSV file


I need to split a huge CSV file into smaller pieces, and I'd like to check whether a PowerShell script can do it. I found a script online (https://www.spjeff.com/2017/06/02/powershell-split-csv-in-1000-line-batches/), but I would like to add a progress indicator to it so I can confirm that it is actually running. It ran for several hours, but no output file was generated.

Please let me know where in the script I could add logic for a progress indicator, both for reading the content of the file and for the split itself.


Solution

  • For splitting CSV files, I would recommend using the steppable pipeline (for a complete explanation, see: Mastering the (steppable) pipeline), which also lets you easily integrate a Write-Progress bar. Each row is streamed from Import-Csv straight into Export-Csv, so the whole file never has to be held in memory.

    Note that the Write-Progress cmdlet is pretty expensive by itself
    (especially in Windows PowerShell 5.1).

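    Import-Csv .\Your.csv | ForEach-Object -Begin {
        $ExpectedBatches = 1000   # estimate; only used to scale the progress bar
        $Index = 0                # number of rows processed so far
        $BatchSize = 1000         # rows per output file
        $Pipeline = $null
    } -Process {
        if ($Index % $BatchSize -eq 0) {
            # First row of a batch: report progress and open a new output file
            $BatchNr = [int][math]::Floor($Index / $BatchSize)
            $Percent = [math]::Min(100, $BatchNr * 100 / $ExpectedBatches)   # -PercentComplete rejects values above 100
            Write-Progress -Activity "Processing" -Status "Batch number: $BatchNr" -PercentComplete $Percent
            $Pipeline = { Export-Csv -NoTypeInformation -Path .\Batch_$BatchNr.csv }.GetSteppablePipeline()
            $Pipeline.Begin($True)
        }
        $Pipeline.Process($_)
        $Index++
        # Batch is full: end the pipeline so Export-Csv closes the file
        if ($Index % $BatchSize -eq 0) { $Pipeline.End(); $Pipeline = $null }
    } -End {
        if ($Pipeline) { $Pipeline.End() }   # close a final, partially filled batch
        Write-Progress -Activity "Processing" -Completed
    }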
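
The percentage shown by Write-Progress is only as good as the $ExpectedBatches value, which is hard-coded above. One possible way to derive it, sketched here under some assumptions, is to count the lines of the file once before splitting. Get-ExpectedBatchCount is a hypothetical helper name, not part of the original answer. It assumes the first line is a header and that no quoted field contains an embedded line break; if fields do span lines, the count (and so the percentage) will be somewhat too high.

```powershell
# Hypothetical helper (not from the original answer): estimates how many
# batch files a CSV will produce, so the progress bar can be scaled.
function Get-ExpectedBatchCount {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $BatchSize = 1000
    )
    # Resolve against PowerShell's current location, which can differ
    # from the .NET process working directory.
    $FullPath = Convert-Path -LiteralPath $Path
    $LineCount = 0
    # ReadLines streams the file line by line, so memory use stays flat
    # even for very large inputs.
    foreach ($Line in [System.IO.File]::ReadLines($FullPath)) { $LineCount++ }
    # Assumption: exactly one header line, one data row per remaining line.
    $DataRows = [math]::Max(0, $LineCount - 1)
    [int][math]::Ceiling([double]$DataRows / $BatchSize)
}
```

Assigning the result of Get-ExpectedBatchCount -Path .\Your.csv -BatchSize 1000 to $ExpectedBatches before the split would replace the hard-coded guess. The extra pass reads the whole file once, so it adds some run time on very large inputs; whether that is worth an accurate percentage is a judgment call.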