PowerShell foreach read/write loop is slow


I have a few hundred files that are around 1.5 MB each. I need to run each file through the loop below, but it is very slow: each file takes about 5 minutes to process. Is there a faster way?

function Convert-File($inputFile,$outputFile,$dataDate)
{
    if ([string]::IsNullOrEmpty($dataDate))
    {
        $dataDate = $inputFile.split('.') | select -last 1
    }
    Write-Host "File data date is $dataDate"
    #Get-Content $inputFile | Select-String -pattern $dataDate | Out-File $outputFile
    $header=""
    $headerOut=$false
    if (Test-Path $outputFile)
    {
        Remove-Item $outputFile
    }
    foreach($line in [System.IO.File]::ReadLines($inputFile))
    {
        if ($line.StartsWith("!"))
        {
            $header=$line
            continue
        }
        if ($line.Contains($dataDate))
        {
            if (!$headerOut)
            {
                $headerOut=$true
                #Write-Host $header
                Set-Content -Path $outputFile -Value $header.substring(1).Replace('|',',') -Force
            }
            if ([string]::IsNullOrEmpty($line)) { continue }
            #Write-Host $line
            Add-Content $outputFile $line.Replace('|',',') -force
        }
    }
}

The code works, but I would like it to run faster. Any suggestions?


Solution

  • Add-Content is the bottleneck in your code: it opens and closes a FileStream on every loop iteration, which is very expensive. The output file should be opened only once.

    Also worth noting: the [string]::IsNullOrEmpty( ) check should be the first condition in your loop, and you most likely want [string]::IsNullOrWhiteSpace( ) instead, though I'll leave that up to you to decide.

    This is how your final loop should look using a StreamWriter:

    try {
        foreach($line in [System.IO.File]::ReadLines($inputFile)) {
            if ([string]::IsNullOrEmpty($line)) {
                continue
            }
            if ($line.StartsWith('!')) {
                $header = $line
                continue
            }
            if ($line.Contains($dataDate)) {
                if (-not $headerOut) {
                    $headerOut = $true
    
                    $fs     = (New-Item $outputFile -Force).OpenWrite()
                    $writer = [System.IO.StreamWriter] $fs
                    $writer.WriteLine($header.SubString(1).Replace('|', ','))
                }
    
                $writer.WriteLine($line.Replace('|', ','))
            }
        }
    }
    finally {
        $writer, $fs | ForEach-Object Dispose
    }
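
For reference, the loop above could be folded back into a complete function roughly as follows. This is a minimal sketch, not the answer's exact code. The name Convert-FileFast is hypothetical, and three details are assumptions added here: both paths are resolved to full paths before being handed to .NET, the header is written only if one was seen, and the writer is disposed behind a null check so a file with no matching lines is handled cleanly. It assumes PowerShell 5 or later for the ::new() syntax.

```powershell
# Sketch only: Convert-FileFast is a hypothetical name, not from the original post.
function Convert-FileFast {
    param(
        [Parameter(Mandatory)] [string] $InputFile,
        [Parameter(Mandatory)] [string] $OutputFile,
        [string] $DataDate
    )

    if ([string]::IsNullOrEmpty($DataDate)) {
        # Same fallback as the question: use the text after the last dot.
        $DataDate = $InputFile.Split('.') | Select-Object -Last 1
    }

    # Assumption added here: .NET resolves relative paths against the process
    # working directory, not the PowerShell location, so use full paths.
    $inPath  = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($InputFile)
    $outPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($OutputFile)

    # Mirror the question: discard any previous output before starting.
    if (Test-Path -LiteralPath $outPath) {
        Remove-Item -LiteralPath $outPath
    }

    $header = ''
    $writer = $null
    try {
        foreach ($line in [System.IO.File]::ReadLines($inPath)) {
            if ([string]::IsNullOrEmpty($line)) { continue }

            if ($line.StartsWith('!')) {
                # Remember the most recent header line; it is written lazily.
                $header = $line
                continue
            }

            if ($line.Contains($DataDate)) {
                if ($null -eq $writer) {
                    # Open the output once, on the first match ($false = overwrite).
                    $writer = [System.IO.StreamWriter]::new($outPath, $false)
                    if ($header.Length -gt 0) {
                        # Drop the leading '!' and convert the delimiter.
                        $writer.WriteLine($header.Substring(1).Replace('|', ','))
                    }
                }
                $writer.WriteLine($line.Replace('|', ','))
            }
        }
    }
    finally {
        # Disposing the writer flushes it and closes the underlying file.
        if ($null -ne $writer) { $writer.Dispose() }
    }
}

# Hypothetical usage (the folder paths are placeholders):
# Get-ChildItem -Path 'C:\data\in' -File | ForEach-Object {
#     Convert-FileFast -InputFile $_.FullName -OutputFile (Join-Path 'C:\data\out' "$($_.Name).csv")
# }
```

One difference to be aware of: a StreamWriter created this way writes UTF-8 without a byte order mark, whereas Set-Content and Add-Content in Windows PowerShell use the system's ANSI code page by default. If the consumer of the output cares about encoding, pass an explicit encoding to the StreamWriter constructor.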