powershell data-processing windows-scripting

How to modify created output using Microsoft PowerShell?


I've written a PowerShell script that searches through files for lines containing the null value '000' and creates a new file in which these values are replaced with user input. The code is included below, and it does function as intended.

The problem I'm having is that the output also contains 'Line', and the command itself is repeated before and after the modified data (this time using an input of '5u7'). This is demonstrated in the output pasted below the code.

I would like to remove these extra lines and also add my own string above the data to show that it has been modified. Ideally the output would look something like this:

THIS DATA HAS BEEN MODIFIED

--- aaa --- 5u7 --- ccc --- ddd
--- aaa --- 5u7 --- ccc --- ppp
--- aaa --- 5u7 --- ccc --- ddd
--- aaa --- 5u7 --- ccc --- zzz

Set-ExecutionPolicy AllSigned
#Set Execution Policy

ls -r -Path C:\Users\Desktop\Data\Raw_data | sls '000' | select Line | 
Out-File C:\Users\Desktop\Data\Raw_data\Processed_data.txt

#Extract lines with null values and create a new file for this data

$file = @(Get-Item -Path "C:\Users\Desktop\Data\Raw_data\Processed_data.txt")

#Defines new file for for loop

$replacementStr = Read-Host -Prompt 'Input data in xxx format'

#Requests input from user to replace null values

$confirmation = Read-Host "Is this correct? (y/n):"

#Confirmation that format is correct

if ($confirmation -eq'n'){

    Remove-Item C:\Users\Desktop\Data\Raw_data\Processed_data.txt
    # exit 
}

#If format is incorrect delete created file and end the script

elseif ($confirmation -eq'y') {
    foreach ($f in $file)
        {
            (Get-Content $f) |
                ForEach-Object { $_ -replace '000' , $replacementStr } |
                Set-Content $f
            Write-Host Processed $f
        }
}

#Search through the file for the 000 values and replace with the new user input string

Line
----
ls -r -Path C:\Users\Desktop\Data\Raw_data | sls '5u7' | select Line | Out-File C:\Users\Desktop\Data\Raw_data\Processed_data.txt
               foreach-object { $_ -replace '5u7' , $replacementStr   } |
--- aaa --- 5u7 --- ccc --- ddd
--- aaa --- 5u7 --- ccc --- ppp
--- aaa --- 5u7 --- ccc --- ddd
--- aaa --- 5u7 --- ccc --- zzz
ls -r -Path C:\Users\Desktop\Data\Raw_data | sls '5u7' | select Line | Out-File C:\Users\Desktop\Data\Raw_data\Processed_data.txt

Solution

  • The command

    ls -r -Path C:\Users\Desktop\Data\Raw_data | sls '000' | select Line

    selects the property "Line", but it still returns an object that carries that property. That is why the property name is printed as a header above the data. To avoid this behaviour, use the -ExpandProperty parameter, which returns only the values of the selected property:

    ls -r -Path C:\Users\Desktop\Data\Raw_data | sls '000' | select -ExpandProperty Line
    

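    To print a custom heading instead, you can define a calculated property with that name and fill it with the contents of the property "Line". Out-File then writes that name as the column header, followed by a row of dashes, above the data.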

    ls -r -Path C:\Users\Desktop\Data\Raw_data | sls '000' | select -Property @{Name = 'THIS DATA HAS BEEN MODIFIED'; Expression = {$_.Line}}
    

    Please note that using aliases in scripts is not best practice: prefer "Get-ChildItem" over "ls" and "Select-String" over "sls". Likewise, avoid abbreviated parameters ("-r" should be "-Recurse"). This makes the script more robust and easier to read.
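
Putting these points together, the following is a minimal sketch rather than a drop-in replacement for the script in the question. It uses full cmdlet and parameter names, -ExpandProperty to get plain strings, and writes the banner as an ordinary first line instead of a column header. The temporary folder, the sample file and the fixed value of $replacementStr are assumptions made only so that the sketch runs on its own; in the real script they would be the Raw_data folder and the Read-Host input.

```powershell
# Hypothetical sandbox so the sketch runs anywhere; swap in the real folders.
$root    = Join-Path ([System.IO.Path]::GetTempPath()) ("sls_demo_" + [guid]::NewGuid().ToString('N'))
$rawDir  = Join-Path $root 'Raw_data'
$outFile = Join-Path $root 'Processed_data.txt'   # kept outside $rawDir so it is never searched itself
New-Item -ItemType Directory -Path $rawDir -Force | Out-Null

# Made-up sample data in the shape shown in the question.
@(
    '--- aaa --- 000 --- ccc --- ddd'
    '--- aaa --- bbb --- ccc --- eee'
    '--- aaa --- 000 --- ccc --- ppp'
) | Set-Content -Path (Join-Path $rawDir 'sample.txt')

$replacementStr = '5u7'   # stands in for the Read-Host input

# -ExpandProperty returns the matching lines as plain strings (no "Line" header).
$lines = Get-ChildItem -Path $rawDir -Recurse -File |
    Select-String -Pattern '000' |
    Select-Object -ExpandProperty Line |
    ForEach-Object { $_ -replace '000', $replacementStr }

# Write the banner, a blank line, then the modified data.
$result = @('THIS DATA HAS BEEN MODIFIED', '') + @($lines)
$result | Set-Content -Path $outFile

Get-Content -Path $outFile
```

The output file is deliberately placed outside the searched folder. In the question, the script's own command lines show up in the result, which suggests (this is an inference from the pasted output, not something the question confirms) that the script file itself sits inside Raw_data and matches '000'. Keeping the script and the output file out of the searched folder, or narrowing the search with a filter, should avoid that.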