powershell text-parsing

PowerShell: Combine next line with current line


I am looking for the desired output shown below. The idea is to join the next line onto the current line, producing a single line, whenever the current line ends with a particular character (a backslash, \, in my code).

Thank you

Current Output Desired Output

$file = 'textfile.txt'
$reportObject = @()

foreach ($line in $file) {
  $content = [IO.File]::ReadAllLines($file)
  for ($i = 0; $i -lt $content.count; $i++) {
    $line = $content[$i]
    if ($line.StartsWith("Users")) {
      $a = 0
      while ($content[$i + $a].EndsWith("\")) {  #"
        $reportObject += $content[$i + $a]
        $a++
      }
      $reportObject += $content[$i + $a]
    }
  }
  $reportObject
}

Solution

    # Use
    #   $reportObject = (Get-Content ...
    # to assign the resulting array of lines to your variable.
    (Get-Content -Raw textfile.txt) -replace '\\\r?\n' -split '\r?\n'
    
    • Get-Content -Raw textfile.txt reads the entire file into a single, multi-line string, due to the use of Get-Content's -Raw switch.

    • -replace '\\\r?\n' replaces a \ character (escaped as \\) followed by a newline (\r?\n)[1] with the (implied) empty string, and therefore effectively removes the matched sequence.

      • For more information about the regex-based -replace operator, see this answer.
    • -split '\r?\n' splits the resulting string back into an array of individual lines.

      • For more information about the regex-based -split operator, see about_Split.
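
The steps above can be traced on an in-memory string instead of a file. This is only a minimal sketch: the sample text and the variable names ($raw, $joined, $lines) are made up for illustration and are not taken from the question's textfile.txt.

```powershell
# Hypothetical sample input, standing in for what Get-Content -Raw would return.
# Single-quoted parts keep the backslash literal; "`r`n" is a CRLF newline.
$raw = 'Users: alice, \' + "`r`n" + 'bob, \' + "`r`n" + 'carol' + "`r`n" + 'Groups: admins'

# Step 1: remove every backslash that is immediately followed by a newline,
# which joins such a line with the line after it.
$joined = $raw -replace '\\\r?\n'

# Step 2: split the joined text back into an array of individual lines.
$lines = $joined -split '\r?\n'

# Emit the resulting lines.
$lines
```

With this sample, the first three input lines should collapse into one line, while the last line is left untouched. Note that the trailing backslash itself is removed along with the newline.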

    Note: If the input file has a trailing newline, the resulting array will have an additional empty element at the end:

    • If there are no other empty lines in the input (that you need to preserve), you can simply append -ne '' to the command above to remove all empty-string elements.

    • To only remove the last element, the simplest (albeit not most efficient) solution is to append
      | Select-Object -SkipLast 1 - see Select-Object's documentation.
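
Both clean-up options can be compared side by side. The sketch below again uses a made-up in-memory string (with a trailing newline) rather than a real file, and the variable names are illustrative only; it assumes a PowerShell version in which Select-Object supports -SkipLast.

```powershell
# Hypothetical input that ends with a trailing newline ("`n").
$raw = 'first \' + "`n" + 'second' + "`n" + 'third' + "`n"

# Same join-and-split as in the solution; because of the trailing newline,
# the array ends with an extra empty string.
$lines = $raw -replace '\\\r?\n' -split '\r?\n'

# Option 1: drop ALL empty-string elements
# (with an array on the left-hand side, -ne acts as a filter).
$noEmpty = @($lines) -ne ''

# Option 2: drop only the LAST element, whatever it contains.
$noLast = $lines | Select-Object -SkipLast 1
```

Option 1 would also remove empty lines in the middle of the input, so option 2 is the safer choice when such lines must be preserved.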


    [1] This regex is a cross-platform way of matching a newline: it matches both Windows-style CRLF sequences (\r\n) and Unix-style LF-only (\n) newlines.
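
To illustrate the footnote, the following sketch splits the same three-line text once with CRLF newlines and once with LF-only newlines. The strings and variable names are invented for the demonstration.

```powershell
# The same text with Windows-style (CRLF) and Unix-style (LF) newlines.
$crlf = "a`r`nb`r`nc"
$lf   = "a`nb`nc"

# '\r?\n' matches an optional carriage return followed by a line feed,
# so both inputs split into the same three elements.
$fromCrlf = $crlf -split '\r?\n'
$fromLf   = $lf   -split '\r?\n'

# For contrast: splitting CRLF text on '\n' alone leaves a stray
# carriage return at the end of every element but the last.
$naive = $crlf -split '\n'
$naive[0].Length   # 2: the letter 'a' plus the leftover "`r"
```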