In PowerShell, the normal way of feeding a file to a command's standard input is to pipe the contents of the file:
Get-Content input-file.txt | Write-Host
However, if the file is very large, PowerShell begins to consume a large amount of memory. Passing a small value to -ReadCount seems to make Get-Content start feeding lines into the downstream command sooner, but the memory consumption is still large.
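For reference, this is the kind of invocation I mean (the file name is just a placeholder); -ReadCount only changes how many lines Get-Content emits per pipeline object, with each object being an array of that many lines:
# Emit the file in batches of 10 lines; each pipeline object is a string array.
Get-Content .\input-file.txt -ReadCount 10 | ForEach-Object { $_ | Write-Host }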
Why is the memory usage so high? Is it that PowerShell is retaining the contents of the file in memory, even though it doesn't need to? Is there some way to mitigate that?
The following function reads the file line by line using the .NET StreamReader class and sends each line down the pipeline. Piping its output to Out-Null, my memory usage only went up by a few tens of KB while it was executing against a log file of nearly 2,000,000 lines (~186 MB):
function Get-ContentByLine {
    param (
        # Path to the file to stream; intended to be passed as an argument
        # (the begin block runs before any pipeline input is bound).
        [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
        [PsObject]$InputObject
    )
    begin {
        $line = $null
        # Convert-Path resolves the PowerShell path (including relative paths)
        # to a filesystem path the .NET API understands.
        $fs = [System.IO.File]::OpenRead((Convert-Path $InputObject))
        $reader = New-Object System.IO.StreamReader($fs)
    }
    process {
        # Read and emit one line at a time so only the current line is held in memory.
        $line = $reader.ReadLine()
        while ($null -ne $line) {
            $line
            $line = $reader.ReadLine()
        }
    }
    end {
        $reader.Dispose()
        $fs.Dispose()
    }
}
You would invoke it like this:
PS C:\> Get-ContentByLine "C:\really.big.log" | Out-Null
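If you want to check the memory behaviour yourself, a rough sketch (using the same placeholder log path) is to snapshot the current PowerShell process's working set before and after the run; the number will fluctuate a bit with garbage collection, so treat it as an estimate:
# Compare this PowerShell process's working set before and after streaming the file.
$before = (Get-Process -Id $PID).WorkingSet64
Get-ContentByLine "C:\really.big.log" | Out-Null
$after = (Get-Process -Id $PID).WorkingSet64
"Working set grew by roughly {0:N0} bytes" -f ($after - $before)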