
PowerShell -replace: only write the file if changes were made


I'm doing multiple regex replacements in PowerShell on a large number of files and would like to write a file only if any replacements were actually made.

For example if I do:

($_ | Get-Content -Raw) -Replace 'MAKEUPS', 'Makeup' -Replace '_MAKEUP', 'Makeup' -Replace 'Make up', 'Makeup' -Replace 'Make-up', 'Makeup' -Replace '"SELF:/', '"' | 
  Out-File $_.FullName -encoding ASCII

I only want to write the file if it found anything to replace. Is this possible, maybe with a count or boolean operation?

I did think maybe to check the length of the string before and after but was hoping for a more elegant solution, so I thought I'd ask the experts!


Solution

  • You can take advantage of the fact that PowerShell's -replace operator passes the input string through as-is if no replacements were performed:

    # <# some Get-ChildItem command #> ... | ForEach-Object {
    
      # Read the input file in full, as a single string.
      $originalContent = $_ | Get-Content -Raw
    
      # *Potentially* perform replacements, depending on whether the search patterns are found.
      $potentiallyModifiedContent =
        $originalContent -replace 'MAKEUPS', 'Makeup' -replace '_MAKEUP', 'Makeup' -replace 'Make up', 'Makeup' -replace 'Make-up', 'Makeup' -replace '"SELF:/', '"'
    
      # Save, but only if modifications were made.
      if (-not [object]::ReferenceEquals($originalContent, $potentiallyModifiedContent)) {
        Set-Content -NoNewLine -Encoding Ascii -LiteralPath $_.FullName -Value $potentiallyModifiedContent
      }
      }
    
    # }
    
    • [object]::ReferenceEquals() tests for reference equality, i.e. whether the two strings represent the exact same string instance, which makes the comparison very efficient (no need to look at the content of the strings).

    • Set-Content rather than Out-File is used to write the output file, because it performs better when the input already consists of strings.

      • -NoNewLine is needed to prevent a trailing newline from getting appended to the output file.
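
The reference-equality check can be tried in isolation, without touching any files. The following is a minimal sketch; the sample strings and the variable names ($unchangedInput, $changedInput, and so on) are made up for illustration and are not part of the original answer.

```powershell
# Illustrative inputs (hypothetical): one contains nothing to replace, one does.
$unchangedInput = 'Nothing to fix here'
$changedInput   = 'Make-up and Make up'

# Apply the same chain of -replace operations to both inputs.
$unchangedResult = $unchangedInput -replace 'Make up', 'Makeup' -replace 'Make-up', 'Makeup'
$changedResult   = $changedInput   -replace 'Make up', 'Makeup' -replace 'Make-up', 'Makeup'

# If no pattern matched, -replace is expected to hand back the very same string instance.
$unchangedIsSameInstance = [object]::ReferenceEquals($unchangedInput, $unchangedResult)
$changedIsSameInstance   = [object]::ReferenceEquals($changedInput, $changedResult)

"Unchanged input passed through as-is: $unchangedIsSameInstance"
"Changed input passed through as-is:   $changedIsSameInstance"
"Result for changed input: $changedResult"
```

One caveat, stated as an assumption rather than a guarantee: a pattern that matches but is replaced with identical text would likely still yield a new string instance, so the file would be rewritten even though its content is the same. If that matters, a case-sensitive content comparison such as `$changedResult -cne $changedInput` is an alternative, at the cost of scanning both strings.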