powershell csv replace find

How to use a CSV file as the input for a PowerShell script redacting log files?


I sometimes need to send log files to a vendor for technical support. Due to security constraints these log files can't contain identifiable data like hostnames, IP addresses, and so on.

I need a quick and easy way to find and replace this data in log files, while still being able to work out what each redaction replaced.

My plan is to scan the log file for IP addresses, create a lookup CSV that maps each IP address to its redaction, then use that CSV to do a find/replace on the input file.

I've managed to create the lookup CSV for redactions, but for the life of me I can't work out how to use it as input to find and replace the IP addresses in the actual file.

$ipNum = 1
$ipRedactions = "D:\ps_script\ipRedactions.csv"
#$global:ipPrefix = "Redacted-IP-"
$ipPrefix = "Redacted-IP-"
$ipRegex = "((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?){4}"
$inputFile = "D:\ps_script\ipsample.txt"
$fileContent = Get-Content $inputFile
$outputFile = ($inputFile + "_Redacted")
#$filecontent | % {[Regex]::Replace($_, $ipRegex, {return $global:ipNum += 1}) } | Set-Content $outputFile
$ipMatches = $fileContent | Select-String $ipRegex -AllMatches | ForEach-Object { $_.Matches.Value }

#Create a csv file of IP addresses and Redactions so we can do a find and replace in the source file later
$CSVPathTest = Test-Path $ipRedactions #Check if csv file exists
If ($CSVPathTest) {Remove-Item -Path $ipRedactions -Force} #Delete csv file if it exists
ForEach ($ipMatch in $ipMatches)
{
    $CSVList += @([pscustomobject]@{"IP Address"= $ipMatch; "Redaction"= $ipPrefix + $ipNum})| Export-Csv -Path $ipRedactions -Append -NoTypeInformation
    $ipNum = $ipNum + 1 #Increment the counter
}

#Create a new output file
$CSVPathTest = Test-Path $outputFile #Check if output file exists
If ($CSVPathTest) {Remove-Item -Path $outputFile -Force} #Delete output file if it exists

#Use the ipRedactions to find and replace IP addresses in the inputFile and create a redacted copy
????

My simplified test data looks like:

test 127.0.0.1test 
test 10.0.0.1 test 
test 172.28.69.77test 
blah blah blahtest 
test 15.26.32.159 test 
test 15.26.32.1594test 
blah blah serverhostname test

and the lookup csv of redactions looks like:

"IP Address","Redaction"
"127.0.0.1","Redacted-IP-1"
"10.0.0.1","Redacted-IP-2"
"172.28.69.77","Redacted-IP-3"
"15.26.32.159","Redacted-IP-4"
"15.26.32.159","Redacted-IP-5"

My attempt so far is above. Maybe I'm just tired but I haven't even managed to think of a way this might be done so far.


Solution

  • Actually I think I've got it working after a bit more searching the internet for ideas.

    $ipNum = 1
    $ipRedactions = "D:\ps_script\ipRedactions.csv"
    #$global:ipPrefix = "Redacted-IP-"
    $ipPrefix = "Redacted-IP-"
    $ipRegex = "((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?){4}"
    $inputFile = "D:\ps_script\ipsample.txt"
    $fileContent = Get-Content $inputFile #Read the input first; the IP scan below needs it
    $outputFile = ($inputFile + "_Redacted")
    #$filecontent | % {[Regex]::Replace($_, $ipRegex, {return $global:ipNum += 1}) } | Set-Content $outputFile
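    #Keep each distinct IP address once, in order of first appearance, so it gets a single redaction
    $ipMatches = $fileContent | Select-String $ipRegex -AllMatches | ForEach-Object { $_.Matches.Value } | Select-Object -Unique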
    
    #Create a csv file of IP addresses and Redactions so we can do a find and replace in the source file later
    $CSVPathTest = Test-Path $ipRedactions #Check if csv file exists
    If ($CSVPathTest) {Remove-Item -Path $ipRedactions -Force} #Delete csv file if it exists
    ForEach ($ipMatch in $ipMatches)
    {
        [pscustomobject]@{"IP_Address"= $ipMatch; "Redaction"= $ipPrefix + $ipNum} | Export-Csv -Path $ipRedactions -Append -NoTypeInformation
        $ipNum = $ipNum + 1 #Increment the counter
    }
    
    #Create a new output file
    $CSVPathTest = Test-Path $outputFile #Check if output file exists
    If ($CSVPathTest) {Remove-Item -Path $outputFile -Force} #Delete output file if it exists
    
    #Use the ipRedactions to find and replace IP addresses in the inputFile
    $redactions = Import-Csv $ipRedactions 
    $fileContent = Get-Content $inputFile 
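    foreach ($row in $redactions)
    {
        #Escape the IP address so its dots match literally rather than as regex wildcards
        $field1 = [regex]::Escape($row.IP_Address)
        $field2 = $row.Redaction
        $fileContent = $fileContent | ForEach-Object { $_ -replace $field1, $field2 }
    }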
    $fileContent | Out-File $outputFile
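
A possible alternative, offered as a sketch rather than a drop-in replacement: the loop above runs one -replace per CSV row, so the file is re-scanned once for every IP address. The same result can probably be reached in a single pass by handing [regex]::Replace a match evaluator that builds the lookup table as it goes, so a repeated address always gets the same label. Everything below that is not in the script above is an assumption for illustration: the in-memory sample lines, the temp-folder paths, the $lookup and $evaluator names, and the stricter IP pattern (all three dots required, no digits touching either end of the match).

```powershell
# Hypothetical sample lines standing in for Get-Content $inputFile
$fileContent = @(
    'test 127.0.0.1test',
    'test 10.0.0.1 test',
    'again 127.0.0.1 here',
    'test 15.26.32.1594test',
    'blah blah serverhostname test'
)

# Hypothetical output locations in the temp folder
$tempDir      = [System.IO.Path]::GetTempPath()
$ipRedactions = Join-Path $tempDir 'ipRedactions.csv'
$outputFile   = Join-Path $tempDir 'ipsample.txt_Redacted'

$ipPrefix = 'Redacted-IP-'

# Assumption: a stricter pattern than the one in the script above.
# Each octet is 0-255, the dots are mandatory, and the lookarounds
# reject candidates that have a digit directly before or after them.
$octet   = '(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)'
$ipRegex = "(?<!\d)$octet(?:\.$octet){3}(?!\d)"

# Ordered map of IP address -> redaction label, filled as new addresses appear
$lookup = [ordered]@{}

# Called once per match; returns the label to substitute for that match
$evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
    param($match)
    $ip = $match.Value
    if (-not $lookup.Contains($ip)) {
        $lookup[$ip] = $ipPrefix + ($lookup.Count + 1)
    }
    $lookup[$ip]
}

# Single pass over the lines: redact and record in the same step
$redacted = foreach ($line in $fileContent) {
    [regex]::Replace($line, $ipRegex, $evaluator)
}

# Write the lookup table so each redaction can be traced back later
$rows = foreach ($ip in $lookup.Keys) {
    [pscustomobject]@{ IP_Address = $ip; Redaction = $lookup[$ip] }
}
$rows | Export-Csv -Path $ipRedactions -NoTypeInformation

$redacted | Set-Content -Path $outputFile
```

With this pattern a string such as 15.26.32.1594 is left untouched, because it is not a well-formed IPv4 address. If over-redacting is preferable to leaking a near-miss, dropping the trailing (?!\d) lookahead would redact its 15.26.32.159 prefix instead.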