Tags: powershell, powershell-2.0, text-parsing

Split & Trim in a single step


In PS 5.0 I can split and trim a string in a single line, like this:

$string = 'One, Two, Three'
$array = ($string.Split(',')).Trim()

But that fails in PS 2.0. I can of course do a foreach to trim each item, or replace ', ' with ',' before doing the split, but I wonder if there is a more elegant approach that works in all versions of PowerShell? Failing that, the replace seems like the best approach to address all versions with a single code base.


Solution

  • TheMadTechnician has provided the crucial pointer in a comment on the question:

    Use the -split operator, which works the same in PSv2: It expects a regular expression (regex) as the separator, allowing for more sophisticated tokenizing than the [string] type's .Split() method, which operates on literals:

    PS> 'One, Two, Three' -split ',\s*' | ForEach-Object { "[$_]" }
    [One]
    [Two]
    [Three]
    

    Regex ,\s* splits the input string by a comma followed by zero or more (*) whitespace characters (\s).

    In fact, choosing -split over .Split() is advisable in general, even in later PowerShell versions.
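
    One practical consequence of -split expecting a regex, sketched here with made-up sample strings: a separator that contains regex metacharacters, such as ., has to be escaped to be treated literally, either with a backslash or via [regex]::Escape(). The variable names are purely illustrative.

```powershell
# A literal comma needs no escaping; ',\s*' also consumes any whitespace after it.
$tokens = 'One, Two, Three' -split ',\s*'

# '.' is a regex metacharacter (matches any character), so it must be escaped
# to act as a literal separator.
$escapedByHand = 'a.b.c' -split '\.'
$escapedByApi  = 'a.b.c' -split [regex]::Escape('.')

# Unescaped, every character is treated as a separator,
# so only empty strings remain.
$unescaped = 'a.b.c' -split '.'
```

    By contrast, 'a.b.c'.Split('.') needs no escaping, because .Split() treats its argument literally.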


    However, to be fully equivalent to the .Trim()-based solution in the question, trimming of leading and trailing whitespace is needed too:

    PS> ' One,   Two,Three  ' -split ',' -replace '^\s+|\s+$' | ForEach-Object { "[$_]" }
    [One]
    [Two]
    [Three]
    

    -replace '^\s+|\s+$' removes the leading and trailing whitespace from each token resulting from the split: | specifies an alternation, so that a match of the subexpression on either side of it counts as a match; ^\s+ matches leading whitespace, \s+$ matches trailing whitespace; \s+ represents a non-empty (one or more, +) run of whitespace characters. Because no replacement operand is given, each match is replaced with the empty string, i.e. removed. For more information about the -replace operator, see this answer.
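
    If the split-and-trim step is needed in several places, it could be wrapped in a small helper. The following is only a sketch: the function name Split-Trimmed and its parameters are hypothetical, not part of PowerShell, and the separator is assumed to be a literal string (hence the [regex]::Escape() call).

```powershell
# Hypothetical helper: split on a literal separator, then trim each token.
# Uses only -split and -replace, so it should also run in PSv2.
function Split-Trimmed {
    param(
        [string] $InputString,
        [string] $Separator = ','
    )
    # Escape the separator so that regex metacharacters are taken literally,
    # then strip leading and trailing whitespace from every token.
    ($InputString -split [regex]::Escape($Separator)) -replace '^\s+|\s+$'
}

$result = Split-Trimmed ' One,   Two,Three  '
```

    Note that a function returning a single token yields a scalar string rather than a one-element array; wrap the call in @(...) if an array is always required.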

    In PSv3+, you could simplify to:

    (' One,   Two,Three  ' -split ',').Trim()
    

    or use the solution shown in the question.
    Both solutions rely on a v3+ feature called member-access enumeration, discussed below.

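    To also weed out empty / all-whitespace elements, append -ne '' to the command; with an array as its left-hand operand, -ne acts as a filter and returns only the elements that are not the empty string.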
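
    A minimal sketch of that filtering step, using an invented input string with an empty and an all-whitespace field:

```powershell
$raw = ' One, , Two,,Three  '

# Split on commas, trim each token, then drop tokens that are empty after trimming.
# The three operators have equal precedence and are evaluated left to right.
# @(...) ensures an array even if only one token survives.
$tokens = @($raw -split ',' -replace '^\s+|\s+$' -ne '')
```

    The order matters: filtering before trimming would let all-whitespace tokens such as ' ' through, because they are not yet equal to the empty string.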


    As for why ('One, Two, Three'.Split(',')).Trim() doesn't work in PSv2: The .Split() method returns an array of tokens, and invoking the .Trim() method on that array - as opposed to its elements - isn't supported in PSv2.

    In PSv3+, the .Trim() method call is implicitly "forwarded" to the elements of the resulting array, resulting in the desired trimming of the individual tokens - this feature is called member-access enumeration.
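
    To illustrate the difference, here is a sketch that contrasts the explicit per-element call, which works in all versions, with the implicit PSv3+ form. The version check is there only so that the snippet also runs in PSv2 without error; the variable names are illustrative.

```powershell
$tokens = 'One, Two, Three'.Split(',')

# Works in all versions, including PSv2: trim each element explicitly.
$trimmedExplicit = $tokens | ForEach-Object { $_.Trim() }

# PSv3+ only: .Trim() is not a member of the array itself, so PowerShell
# calls it on each element instead (member-access enumeration).
if ($PSVersionTable.PSVersion.Major -ge 3) {
    $trimmedImplicit = $tokens.Trim()
}
```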