PowerShell | How can I use multithreading for my file deleter PowerShell script?


I've written a script that deletes files in a specific folder once they are more than 5 days old. I'm running it against a directory with hundreds of thousands of files, and it takes a long time.

This is currently my code:

#Variables
$path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
$age = (Get-Date).AddDays(-5) # Defines the 'x days old' (today's date minus x days)

# Get all the files in the folder and subfolders | foreach file
Get-ChildItem $path -Recurse -File | foreach{

    # if creationtime is 'le' (less or equal) than $age
    if ($_.CreationTime -le $age){
        Write-Output "Older than $age days - $($_.name)"
        Remove-Item $_.fullname -Force -Verbose # remove the item
    }

    else{
        Write-Output "Less than $age days old - $($_.name)"
    }
}

I've searched the internet for some time to find out how to use Runspaces, but I find them very confusing and I'm not sure how to implement them in this script. Could anyone please give me an example of how to use Runspaces for this code?

Thank you very much!

EDIT: I've found this post: https://adamtheautomator.com/powershell-multithreading/

And ended up changing my script to this:

$Scriptblock = {
    # Variables
    $path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
    $age = (Get-Date).AddDays(-5) # Defines the 'x days old' (today's date minus x days)

    # Get all the files in the folder and subfolders | foreach file
    Get-ChildItem $path -Recurse -File | foreach{

        # if creationtime is 'le' (less or equal) than $age
        if ($_.CreationTime -le $age){
            Write-Output "Older than $age days - $($_.name)"
            Remove-Item $_.fullname -Force -Verbose # remove the item
        }

        else{
            Write-Output "Less than $age days old - $($_.name)"
        }
    }
}

$MaxThreads = 5
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxThreads)
$RunspacePool.Open()
$Jobs = @()

1..10 | Foreach-Object {
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $RunspacePool
    $PowerShell.AddScript($ScriptBlock).AddArgument($_)
    $Jobs += $PowerShell.BeginInvoke()
}

while ($Jobs.IsCompleted -contains $false) {
    Start-Sleep 1
}

However, I'm not sure whether this works correctly now. I don't get any errors, but the terminal doesn't show anything, so I can't tell whether it is working or simply doing nothing.

I'd love any feedback on this!


Solution

  • The easiest answer is: get PowerShell v7.2.5 (look in the assets for PowerShell-7.2.5-win-x64.zip), download and extract it. It's a no-install PowerShell 7 which has easy multithreading and lets you change foreach { to foreach -parallel { (in a pipeline, foreach is the alias for ForEach-Object, so this is ForEach-Object -Parallel). The executable is pwsh.exe.
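
As a rough sketch of what the parallel version might look like in PowerShell 7: variables from the calling scope are not visible inside a -Parallel script block unless they are referenced with $using:, and -ThrottleLimit caps how many script blocks run at once. The function name Remove-OldFilesParallel and its parameters are hypothetical, made up for this illustration, and the logging from the original script is left out.

```powershell
# Requires PowerShell 7+; ForEach-Object -Parallel does not exist in Windows PowerShell 5.1.
# Remove-OldFilesParallel is a hypothetical helper name, not part of the original script.
function Remove-OldFilesParallel {
    param(
        [Parameter(Mandatory)] [string]   $Path,
        [Parameter(Mandatory)] [datetime] $Cutoff,
        [int] $ThrottleLimit = 5
    )

    # The listing itself is still a single sequential Get-ChildItem call;
    # only the age check and the delete run in parallel.
    Get-ChildItem -LiteralPath $Path -Recurse -File |
        ForEach-Object -ThrottleLimit $ThrottleLimit -Parallel {
            # $using: copies the caller's $Cutoff into this parallel runspace.
            if ($_.CreationTime -le $using:Cutoff) {
                Remove-Item -LiteralPath $_.FullName -Force
            }
        }
}

# Possible usage with the values from the question:
# Remove-OldFilesParallel -Path $path -Cutoff (Get-Date).AddDays(-5)
```

Whether this is actually faster depends on where the time goes; if the listing dominates, parallel deletes will not change much.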


    But, if it's severely overloading the server, running it several times will only make things worse, right? And I think the Get-ChildItem will be the slowest part, putting the most load on the server, and so doing the delete in parallel probably won't help.
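
One way to check that guess before changing anything is to time the listing on its own. The sketch below only counts files and deletes nothing; Measure-FileListing is a hypothetical helper name, and the property names on the result object are likewise made up for this example.

```powershell
# Measure-FileListing is a hypothetical helper: it walks the tree, counts files,
# and reports how long the enumeration alone took. Nothing is deleted.
function Measure-FileListing {
    param(
        [Parameter(Mandatory)] [string]   $Path,
        [Parameter(Mandatory)] [datetime] $Cutoff
    )

    $timer = [System.Diagnostics.Stopwatch]::StartNew()
    $total = 0
    $old   = 0

    Get-ChildItem -LiteralPath $Path -Recurse -File | ForEach-Object {
        $total++
        if ($_.CreationTime -le $Cutoff) { $old++ }
    }

    $timer.Stop()
    [pscustomobject]@{
        TotalFiles = $total
        OldFiles   = $old
        Seconds    = $timer.Elapsed.TotalSeconds
    }
}

# Possible usage with the values from the question:
# Measure-FileListing -Path $path -Cutoff (Get-Date).AddDays(-5)
```

If the reported time is close to the runtime of the full script, the listing is the bottleneck and deleting in parallel is unlikely to help.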

    I would first try changing the script to this shape:

    $path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
    $age = (Get-Date).AddDays(-5)
    
    $logOldFiles = [System.IO.StreamWriter]::new('c:\temp\log-oldfiles.txt')
    $logNewFiles = [System.IO.StreamWriter]::new('c:\temp\log-newfiles.txt')
    
    Get-ChildItem $path -Recurse -File | foreach {
    
        if ($_.CreationTime -le $age){
            $logOldFiles.WriteLine("Created on or before $age - $($_.name)")
            $_   # send file down pipeline to remove-item
        }
        else{
            $logNewFiles.WriteLine("Created after $age - $($_.name)")
        }
    } | Remove-Item -Force
    
    $logOldFiles.Close()
    $logNewFiles.Close()
    

    So it pipelines into Remove-Item and doesn't send hundreds of thousands of text lines to the console (which is also a slow thing to do).

    If that doesn't help, I would switch to robocopy /L and maybe look at robocopy /L /MINAGE... to do the file listing, then process that to do the removal.
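
A sketch of how the robocopy listing could be wired up, under some assumptions: Get-OldFilesViaRobocopy is a hypothetical helper name, NULL is just a placeholder destination (with /L nothing is copied or created), and the extra switches are only there to reduce the report to bare paths. Note that /MINAGE goes by last-modified time, not the CreationTime the original script checks, so the two can select different files.

```powershell
# Windows only: robocopy.exe ships with Windows.
# Get-OldFilesViaRobocopy is a hypothetical helper name, not an existing command.
function Get-OldFilesViaRobocopy {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $Days = 5
    )

    # /L          list only, copy nothing
    # /S          include subfolders
    # /MINAGE:n   skip files modified within the last n days
    # /NJH /NJS   no job header or summary
    # /NDL /NC /NS no folder lines, file classes, or sizes
    # /FP         print full paths
    robocopy $Path NULL /L /S "/MINAGE:$Days" /NJH /NJS /NDL /NC /NS /FP |
        ForEach-Object { $_.Trim() } |   # robocopy pads each line with whitespace
        Where-Object { $_ }              # drop empty lines
}

# Possible usage: review the list first, then delete.
# Get-OldFilesViaRobocopy -Path $path -Days 5 |
#     ForEach-Object { Remove-Item -LiteralPath $_ -Force }
```

It is worth running the listing on its own first and checking a few of the paths before piping anything into Remove-Item.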

    (I also removed the comments which just repeat the lines of code # removed comments which repeat what the code says. The code tells you what the code says # read the code to see what the code does. Comments should tell you why the code does things, like who wrote the script and what business case was it solving, what is the PFRepository, why is there a 5 day cutoff, or whatever.)