Search code examples
powershellforeach-object

Powershell: Why is my variable empty after using ForEach-Object -Parallel?


I am trying to gather data from several servers using ForEach-Object -Parallel. The variable I use is being populated within the loop, but when the loop finishes the variable is empty.

$DBDetails = "SELECT @@VERSION"

$VMs = ("vm1", "vm2", "vm3", "vm4", "vm5", "vm6", "vm7")
$DBInventory = @()

$scriptBlock = {
    $vm = $_
    $result = Invoke-Sqlcmd -ServerInstance $vm -Query $using:DBDetails
    $DBInventory += $result
    Write-Host "Added $($result.Count) rows from $($vm)"
}

$VMs | ForEach-Object -Parallel $scriptBlock
Write-Host "Number of elements in DBInventory: $($DBInventory.Count)"

I expect the last line to return the number of elements gathered within the loop that executed on the previous line. There should be a total of 7 elements, but I am left with none.

My result looks like this:

Added 1 rows from vm1
Added 1 rows from vm2
Added 1 rows from vm3
Added 1 rows from vm4
Added 1 rows from vm5
Added 1 rows from vm6
Added 1 rows from vm7
Number of elements in DBInventory: 0

Solution

  • ForEach-Object -Parallel causes execution of the loop body in a separate runspace, meaning you don't have direct access to the variables defined in the calling scope.

    To work around this, make two changes to your code:

    • Use a collection type other than a resizable array (below I've use a generic [List[psobject]])
    • Reference the variable from the caller's scope with the using: scope modifier and assign to a local inside the block

    The resulting local variable will then reference the same list-object in memory, and changes made to that list via its methods (Add(), Remove(), AddRange(), etc.) will be reflected anywhere else its referenced (including the original $DBInventory variable from your calling scope).

    $DBDetails = "SELECT @@VERSION"
    
    $VMs = ("vm1", "vm2", "vm3", "vm4", "vm5", "vm6", "vm7")
    $DBInventory = [System.Collections.Generic.List[psobject]]::new()
    
    $scriptBlock = {
        $vm = $_
        $inventory = $using:DBInventory
        
        $result = Invoke-Sqlcmd -ServerInstance $vm -Query $using:DBDetails
        $inventory.AddRange([psobject[]]$result)
        Write-Host "Added $($result.Count) rows from $($vm)"
    }
    
    $VMs | ForEach-Object -Parallel $scriptBlock
    Write-Host "Number of elements in DBInventory: $($DBInventory.Count)"
    

    As mklement0 notes, [List[psobject]] is not thread-safe - for production code you'll definitely want to pick a collection type that is, like for example a [System.Collections.Concurrent.ConcurrenBag[psobject]] - essentially an unordered list:

    $DBInventory = [System.Collections.Concurrent.ConcurrentBag[psobject]]::new()
    

    Beware that the ConcurrentBag type, as the name might suggest, does not preserve insertion order. If this is a problem, you may want to consider using a [ConcurrentDictionary[string,psobject[]]] - this way you can tie the query output back to the orignal input string:

    $DBInventory = [System.Collections.Concurrent.ConcurrentDictionary[string,psobject[]]]::new()
    

    Since another thread may (hypothetically) have added an entry for the same key since you dispatched your call to Add(), the ConcurrentDictionary type requires us to use it slightly differently than a regular dictionary or hashtable:

    $scriptBlock = {
        $vm = $_
        $inventory = $using:DBInventory
        
        $result = Invoke-Sqlcmd -ServerInstance $vm -Query $using:DBDetails
        $adder = $updater = { return Write-Output $result -NoEnumerate }
        $inventory.AddOrUpdate($vm, $adder, $updater)
        Write-Host "Added $($result.Count) rows from $($vm)"
    }
    

    Here, the concurrent dictionary will execute the $adder function on our behalf if the key doesn't already exist (otherwise it'll run the $updater), and the result will be assigned as the entry value.

    You can subsequently access the entry values the same way you would a hashtable:

    $DBInventory[$vms[-1]] # returns array containing the query results from the last VM in the list