Predicate script blocks and collections?


Function Invoke-Keep {
    <#
    .SYNOPSIS
    Implement the keep operation on collections, including nested arrays.
    
    .DESCRIPTION
    Given a collection (which may contain nested arrays), apply the predicate to each element and return an array of elements where the predicate is true.

    .PARAMETER Data
    Collection of data to filter through, can be nested arrays.

    .PARAMETER Predicate
    The predicate to operate on the data.

    .EXAMPLE
    $IsSumGreaterThan10 = { param($row) ($row | Measure-Object -Sum).Sum -gt 10 }
    Invoke-Keep -Data @(@(1, 2, 3), @(5, 6), @(10, 5), @(7, 3)) -Predicate $IsSumGreaterThan10
    Return: @(@(5, 6), @(10, 5))
    #>

In trying to create the above function (and testing it with the predicate and data in .EXAMPLE), I found case 1 that fails and case 2 that succeeds but am unclear on why:

Case 1:

[CmdletBinding()]
    Param(
        [Object[]]$Data,
        [ScriptBlock]$Predicate
    )
    foreach ($item in $Data) {
        if ($Predicate.Invoke($Item)) {
            return $item
        }
    }

Case 2:

[CmdletBinding()]
    Param(
        [Object[]]$Data,
        [ScriptBlock]$Predicate
    )
   return $Data | Where-Object {& $Predicate $_}
}

Case 1 seemingly works fine with flat data but returns nothing when passed nested arrays. Case 2 handles nested arrays fine. But why?!

This might already be answered but I'm sufficiently dumb as to not have the language to even articulate my question in a search box.


Solution

  • Re case 1:

    Replace:

            if ($Predicate.Invoke($Item)) {
                return $item
            }
    

    with:

            if (& $Predicate $item) {
                , $item
            }
    
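    • Your primary problem was that the .Invoke() method expects an array of arguments, so an array passed directly is treated as the argument list itself: its elements are bound as separate arguments, and $row receives only the first one. If you want to pass what happens to be an array as a single argument, you'd have to wrap that argument in a single-element helper array, as in $Predicate.Invoke((, $item)).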

      • However, using &, the call operator, to invoke your script block with argument(s) makes this unnecessary, and using & is generally preferable in PowerShell code.
    • A secondary problem was the use of return inside your foreach block, which would have stopped processing after finding one match.

      • Omitting return solves that problem in principle, but note that if $item by itself were sent to the success output stream, it would be subject to auto-enumeration, and the elements of each matching nested array would be output one by one.

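      • The unary form of ",", the array-constructor ("comma") operator, prevents this: it wraps $item in a transient single-element array, and auto-enumeration unwraps only that wrapper, so the nested array itself is output as a whole.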

        • A conceptually clearer, but slower alternative is to use
          Write-Output -NoEnumerate $item.
        • See this answer for background information.
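
The points above can be checked in isolation with a minimal sketch. It reuses the predicate from the question; the helper functions Get-Plain and Get-Wrapped and all variable names are hypothetical and exist only for illustration.

```powershell
# Predicate from the question: true when the elements of $row sum to more than 10.
$IsSumGreaterThan10 = { param($row) ($row | Measure-Object -Sum).Sum -gt 10 }
$item = @(5, 6)

# .Invoke() treats the array as the argument list, so $row is bound to 5 only.
$viaInvoke = $IsSumGreaterThan10.Invoke($item)

# A single-element helper array makes the whole array one argument.
$viaInvokeWrapped = $IsSumGreaterThan10.Invoke((, $item))

# The call operator passes $item as one argument without any wrapping.
$viaCallOperator = & $IsSumGreaterThan10 $item

# Hypothetical helper: outputs its argument as-is, so an array is auto-enumerated.
function Get-Plain {
    param($Value)
    $Value
}

# Hypothetical helper: the unary comma adds a wrapper array; only the wrapper is unwrapped.
function Get-Wrapped {
    param($Value)
    , $Value
}

$enumerated = @(Get-Plain $item)     # collects the individual elements 5 and 6
$preserved  = @(Get-Wrapped $item)   # collects one element: the array (5, 6)

"Invoke: $($viaInvoke[0]); wrapped Invoke: $($viaInvokeWrapped[0]); call operator: $viaCallOperator"
"Plain output count: $($enumerated.Count); wrapped output count: $($preserved.Count)"
```

Note that .Invoke() returns a collection even for a single result, which is why the sketch indexes into it, whereas & returns the Boolean directly.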

    The reason that case 2 works is that using your $Data array as pipeline input causes its enumeration: the elements, which are nested arrays in your case, are passed one by one to Where-Object, so the automatic $_ variable in the script block properly refers to each nested array as a whole.
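
Putting both approaches side by side, a sketch of the two working variants could look like the following. The function names Invoke-KeepLoop and Invoke-KeepPipeline are hypothetical stand-ins for the question's Invoke-Keep, chosen only so that both can be defined at once; the data and predicate are those from the question's .EXAMPLE.

```powershell
# Variant of case 1 with both fixes applied: & instead of .Invoke(), and ", $item" instead of return.
function Invoke-KeepLoop {
    [CmdletBinding()]
    Param(
        [Object[]]$Data,
        [ScriptBlock]$Predicate
    )
    foreach ($item in $Data) {
        if (& $Predicate $item) {
            , $item   # emit the nested array as a single object
        }
    }
}

# Case 2 as posted (minus the redundant return): the pipeline enumerates $Data,
# and Where-Object passes each matching element through unchanged.
function Invoke-KeepPipeline {
    [CmdletBinding()]
    Param(
        [Object[]]$Data,
        [ScriptBlock]$Predicate
    )
    $Data | Where-Object { & $Predicate $_ }
}

$IsSumGreaterThan10 = { param($row) ($row | Measure-Object -Sum).Sum -gt 10 }
$data = @(@(1, 2, 3), @(5, 6), @(10, 5), @(7, 3))

# @(...) guarantees an array even if only one nested array matches.
$fromLoop     = @(Invoke-KeepLoop -Data $data -Predicate $IsSumGreaterThan10)
$fromPipeline = @(Invoke-KeepPipeline -Data $data -Predicate $IsSumGreaterThan10)

$fromLoop     | ForEach-Object { "loop kept: $($_ -join ', ')" }
$fromPipeline | ForEach-Object { "pipeline kept: $($_ -join ', ')" }
```

With this data, both variants should keep @(5, 6) and @(10, 5). The row @(7, 3) sums to exactly 10, so it does not satisfy -gt 10.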