
Why does $null behave differently in ForEach-Object {} vs foreach()?


Consider the following examples:

$null

foreach ($n in $null) {'This is a $null test'}
(no output)

$null | ForEach-Object {'This is a $null test'}
This is a $null test

$null -in $null
True

$null -contains $null
True

 

[int]1

foreach ($n in [int]1) {'Test'}
Test

[int]1 | ForEach-Object {'Test'}
Test

[int]1 -in [int]1
True

[int]1 -contains [int]1
True


Since $null is a scalar value that contains nothing, the second $null example sends a single instance of 'nothing' down the pipeline, which explains the output.

Why can't $null be iterated over as a collection even though $null -in $null returns True?

  • The performance gains from not iterating over evaluated null objects/values seem obvious, but why does PowerShell treat scalar variables/values as collections?
  • Does this behavior persist through all of .NET and the .NET Framework?
  • Is there type coercion happening or do I have a fundamental misunderstanding about how collections/lists are structured or behave?

Solution

  • tl;dr

    The asymmetry you've observed is definitely unfortunate - but is unlikely to get resolved, so as not to break backward compatibility (see the bottom section for a potential solution).


    Both
    <commandOrExpression> | ... (pipeline) and
    foreach ($var in <commandOrExpression>) { ... }
    are enumeration contexts:

    • If <commandOrExpression> evaluates to an object that is enumerable,[1] it is automatically enumerated and the enumerated elements are processed one by one.

    • If it doesn't, the result is processed as itself; in a manner of speaking, it is treated like a single-element enumerable.
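To make the two cases concrete, here is a minimal sketch that counts how often each context processes an array versus a scalar. The variable names are made up for illustration, and the expected counts in the comments assume PowerShell v3 or later.

```powershell
# An array is enumerable, so both contexts process its elements one by one.
$arrayViaPipeline = @(1, 2, 3) | ForEach-Object { "item: $_" }
$arrayViaForeach  = foreach ($n in @(1, 2, 3)) { "item: $n" }

# A scalar is not enumerable, so both contexts process it once, as itself.
$scalarViaPipeline = 42 | ForEach-Object { "item: $_" }
$scalarViaForeach  = foreach ($n in 42) { "item: $n" }

# @(...) ensures that .Count also works for a single output object.
@($arrayViaPipeline).Count    # 3
@($arrayViaForeach).Count     # 3
@($scalarViaPipeline).Count   # 1
@($scalarViaForeach).Count    # 1
```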


    • $null, which corresponds to null in C#, is a special something that happens to represent a "scalar (single) nothing".

    • This contrasts with the enumerable null - technically the [System.Management.Automation.Internal.AutomationNull]::Value singleton, aka "Automation Null" - which is what you get when a command produces no output.

      • It is a PS-specific representation of an enumerable that is nothing in itself and also has no elements: Its purpose is to signal "I represent nothing - I am not an object myself (unless I'm forced to act as one (as a scalar) in expressions, in which case I'll pretend to be $null), and enumerating me produces no elements".
    • As an aside: as of PowerShell 7.4.x it is still nontrivial to determine if a given stored value is an actual $null or an enumerable null (a distinction that situationally does matter):

      # Only $true if $someValue contains the enumerable null 
      $null -eq $someValue -and @($someValue).Count -eq 0
      
      • The following future enhancement has been green-lit, but has not yet been implemented - see GitHub issue #13465:

        # !! Future enhancement - doesn't work as of v7.4.x
        $someValue -is [System.Management.Automation.AutomationNull]
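The count-based test above can be tried out against both kinds of null. This is only a sketch: the variable names are invented, and `& {}` (invoking an empty script block) is used here as one convenient way to obtain the enumerable null. The test is deliberately written inline rather than wrapped in a function, because passing the enumerable null as an argument would turn it into $null.

```powershell
$scalarNull     = $null
$enumerableNull = & {}   # a script block that produces no output

# Both compare equal to $null in an expression ...
$null -eq $scalarNull       # True
$null -eq $enumerableNull   # True

# ... but only the enumerable null yields an empty array when wrapped in @(...).
$isEnumNull1 = $null -eq $scalarNull -and @($scalarNull).Count -eq 0
$isEnumNull2 = $null -eq $enumerableNull -and @($enumerableNull).Count -eq 0

$isEnumNull1   # False, because @($null) has one element
$isEnumNull2   # True
```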
        

    Therefore, both the foreach loop and the pipeline:

    • fortunately do not perform enumeration if the enumerable null is provided as input, given that its very purpose is to signal that there is nothing to enumerate.

      # Capture the output of a command that produces NO output.
      $noOutput = Get-Item *NoSuchFile*
      
      # No output, because the loop body is not executed.
      foreach ($obj in $noOutput) { 'This does not print.' }
      
      # Ditto.
      $noOutput | ForEach-Object { 'This does not print' }
      
    • should process $null as itself, and therefore result in a single processing iteration with $null as the (non-)object to process:

      • Fortunately, this is how it works in the pipeline, as evidenced by the sample ForEach-Object call in your question.

      • Unfortunately, this is not how the foreach language statement behaves, which unexpectedly does not process the $null, as you've also observed.
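As a compact summary of the four combinations, the following sketch tallies how many times each context runs its body for each kind of null. The hashtable keys and variable names are illustrative only, and the counts in the comments assume PowerShell v3 or later.

```powershell
$enumerableNull = [System.Management.Automation.Internal.AutomationNull]::Value

# Iteration counters, one per combination.
$runs = @{ PipelineNull = 0; ForeachNull = 0; PipelineEnumNull = 0; ForeachEnumNull = 0 }

$null | ForEach-Object { $runs.PipelineNull++ }
foreach ($x in $null) { $runs.ForeachNull++ }

$enumerableNull | ForEach-Object { $runs.PipelineEnumNull++ }
foreach ($x in $enumerableNull) { $runs.ForeachEnumNull++ }

$runs.PipelineNull       # 1 - the pipeline processes $null as itself
$runs.ForeachNull        # 0 - foreach ignores $null (the asymmetry)
$runs.PipelineEnumNull   # 0
$runs.ForeachEnumNull    # 0
```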


    As for the reason for this asymmetry:

    • Up to version 2 of Windows PowerShell, assigning the output from a command that produced no output to a variable implicitly - and unexpectedly - converted the enumerable null (representing the lack of output) to $null, which caused the following statements not to be equivalent:

      # DIRECT processing of command output - the loop body is NOT entered.
      foreach ($obj in Get-Item *NoSuchFile*) { 'This does not print' }
      
      # INDIRECT processing of command output, via an *intermediate variable*.
      $output = Get-Item *NoSuchFile*
      # In v2, $output was $null, not the enumerable null,
      # causing the loop body to execute (once, with $obj containing $null)
      # In v3+, this is now - fortunately - equivalent to the above.
      foreach ($obj in $output) { 'This does not print' }
      
    • In v3, two fundamental changes were made:

      • Variables now do store the enumerable null as such, without quietly converting it to $null.
      • It was decided to ignore $null as foreach input:[2]
    • Given the former v3 change, implementing the latter was not strictly necessary, but presumably:

      • The latter change was motivated by also making non-existent variables result in no processing.

      • The assumption was made that if a pipeline is used, intermediate and therefore potentially non-existent variables aren't needed, in which case the problem doesn't arise.

    • However, the problem (still) does arise in the pipeline, namely if non-existent variables or non-existent properties of existing objects are used as input, because they evaluate to $null and therefore (justifiably) cause $null to be sent through the pipeline.
      Also, passing the enumerable null as an argument to an untyped parameter of a function or script causes quiet conversion to $null.
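Both pitfalls can be reproduced with a short sketch. It assumes that strict mode is off (the default), that `$noSuchVariable` and `NoSuchProperty` really don't exist, and it uses a hypothetical helper function, `Get-ArgumentCount`, that exists only for this illustration.

```powershell
# Make sure the (hypothetical) variable is undefined in this session.
Remove-Variable -Name noSuchVariable -ErrorAction Ignore

# Non-existent variables and properties evaluate to $null,
# so the pipeline processes each of them once.
$viaVariable = @($noSuchVariable | ForEach-Object { 'ran' }).Count
$viaProperty = @((Get-Date).NoSuchProperty | ForEach-Object { 'ran' }).Count
$viaVariable   # 1
$viaProperty   # 1

# Hypothetical helper with an untyped parameter.
function Get-ArgumentCount { param($Value) @($Value).Count }

$enumerableNull = & {}   # a script block that produces no output
$countDirect  = @($enumerableNull).Count                    # 0
$countAsParam = Get-ArgumentCount -Value $enumerableNull    # 1 - it arrived as $null
```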


    It follows from the above that a potential solution that should only minimally impact backward compatibility, if at all, is the following:

    • Make references to non-existent variables or properties default to the enumerable null rather than $null, which would implicitly prevent enumeration in both the pipeline and in foreach statements.
      Additionally, preserve the enumerable null as function / script arguments when binding to untyped parameters.

      • In expression contexts, the enumerable null behaves like $null, so something like $null -eq $noSuchVariable would continue to work.
    • Make foreach also process actual $null values as such, as the pipeline already does, which would eliminate the asymmetry.


    [1] Specifically, types that implement the IEnumerable interface or its generic counterpart are automatically enumerated, but there are a few exceptions: strings, dictionaries (types implementing IDictionary or its generic counterpart), and System.Xml.XmlNode instances. Additionally, System.Data.DataTable is enumerated too, despite not implementing IEnumerable itself, via its .Rows property. See this answer for details.
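A quick way to see two of these exceptions is to count processing iterations per input type. This is just a sketch with made-up variable names; XmlNode and DataTable are left out to keep it short.

```powershell
# Arrays are enumerated element by element.
$arrayRuns = @(@(1, 2, 3) | ForEach-Object { 'ran' }).Count

# Strings implement IEnumerable (of characters), but are processed as a whole.
$stringRuns = @('abc' | ForEach-Object { 'ran' }).Count

# Hashtables implement IDictionary and are also processed as a whole.
$hashtableRuns = @(@{ a = 1; b = 2 } | ForEach-Object { 'ran' }).Count

$arrayRuns       # 3
$stringRuns      # 1
$hashtableRuns   # 1
```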

    [2] This change in behavior in v3 was announced on the PowerShell team blog in 2012: New V3 Language Features, section "ForEach statement does not iterate over $null".