
Does ForEach-Object operate on a single object in the pipeline or on a collection of objects?


I've had trouble grasping how the PowerShell pipeline works and I realise a lot of the problem is due to ForEach-Object. In other languages I've used, foreach operates on a collection, iterating through each element of the collection in turn. I assumed ForEach-Object, when used in a PowerShell pipeline, would do the same. However, everything I read about the pipeline suggests each element of a collection is passed through the pipeline separately and that downstream cmdlets are called repeatedly, operating on each element separately rather than on the collection as a whole.

So does ForEach-Object operate on a single element in the collection, rather than on the collection as a whole? Looking at it a different way, does the pipeline operator pass the whole collection to ForEach-Object, which then iterates over it, or does the pipeline itself iterate over the collection and pass each element separately to ForEach-Object?


Solution

  • The ForEach-Object cmdlet - unlike the foreach statement - itself performs no enumeration.

    Instead, it operates on each item passed through the pipeline (with the option to also execute code before receiving the first and after receiving the last item, if any).

    Therefore, it is arguably poorly named, given that it is the pipeline that provides the enumeration (by default), and that ForEach-Object simply invokes a script block for each item received.

    The following examples illustrate this:

    # Let the pipeline enumerate the elements of an array:
    > 1, 2 | ForEach-Object { "item: [$_]; count: $($_.Count)" }
    item: [1]; count: 1
    item: [2]; count: 1
    
    # Send the array *as a whole* through the pipeline (PSv4+)
    > Write-Output -NoEnumerate 1, 2 | ForEach-Object { "item: [$_]; count: $($_.Count)" }
    item: [1 2]; count: 2
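
    The per-item behaviour, along with the optional code that runs before the first and after the last item, can be made visible by recording the order in which the script blocks run. This is only a minimal sketch: the $log variable and the "upstream" stage are illustrative names, not part of any API. It relies on ForEach-Object's -Begin, -Process and -End parameters.

    ```powershell
    # Hypothetical log used only to record the order of execution.
    $log = New-Object System.Collections.Generic.List[string]

    1, 2 |
        ForEach-Object { $log.Add("upstream $_"); $_ } |
        ForEach-Object -Begin { $log.Add('begin') } -Process { $log.Add("process $_") } -End { $log.Add('end') }

    # Begin blocks run before any input flows; after that, each item travels
    # through both stages before the next item is produced.
    $log -join ', '
    # begin, upstream 1, process 1, upstream 2, process 2, end
    ```

    The interleaved "upstream" / "process" entries suggest that the downstream ForEach-Object never sees the array itself, only one element at a time.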
    

    Note that scripts / functions / cmdlets can choose whether a collection they write to the output stream (pipeline) should be enumerated or sent as a whole (as a single object).

    In PowerShell code (scripts or functions, whether advanced (cmdlet-like) or not), enumeration is the default, but you can opt out with Write-Output -NoEnumerate. The -NoEnumerate switch was introduced in PSv4; prior to that, you had to use $PSCmdlet.WriteObject(), which is only available to advanced scripts / functions.
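
    As a rough illustration of these options, the sketch below defines three functions that output the same array in different ways. The function names (Get-Enumerated, Get-Whole, Get-WholeAdvanced) are hypothetical and exist only for this example. The second argument passed to $PSCmdlet.WriteObject() is its enumerateCollection flag.

    ```powershell
    # Hypothetical helper: implicit output, so the array is enumerated.
    function Get-Enumerated {
        $items = 1, 2, 3
        $items
    }

    # Hypothetical helper: the array is sent as a single object (PSv4+).
    function Get-Whole {
        $items = 1, 2, 3
        Write-Output -NoEnumerate $items
    }

    # Hypothetical helper: advanced function, so $PSCmdlet is available.
    function Get-WholeAdvanced {
        [CmdletBinding()]
        param()
        $items = 1, 2, 3
        # $false = do not enumerate the collection
        $PSCmdlet.WriteObject($items, $false)
    }

    # Measure-Object counts the objects it receives from the pipeline.
    $countEnumerated = (Get-Enumerated    | Measure-Object).Count
    $countWhole      = (Get-Whole         | Measure-Object).Count
    $countAdvanced   = (Get-WholeAdvanced | Measure-Object).Count
    "$countEnumerated, $countWhole, $countAdvanced"
    ```

    In Windows PowerShell 5.1 and in PowerShell 6.2 or later, this should report 3 objects for the first function and 1 for each of the other two.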

    Also note that embedding a command in an expression by enclosing it in (...) forces enumeration:

    # Send array as a whole.
    > Write-Output -NoEnumerate 1, 2 | Measure-Object
    
    Count: 1
    ...
    
    # Converting the Write-Output -NoEnumerate command to an expression
    # by enclosing it in (...) forces enumeration
    > (Write-Output -NoEnumerate 1, 2) | Measure-Object
    
    Count: 2
    ...