Would someone be able to advise me on this, please?
The source file (filename sourcetrans.txt) contains lines that include a path and row number, e.g.:
C:\temp\TRANSFile1.txt:1001:DTRANS 1.111111111 12345667889 debit product1
C:\temp\TRANSFile1.txt:20002:DTRANS 2.222222222 23143453456 credit product2
C:\temp\TRANSFile1.txt:300:DTRANS 3.333333333 23443655678 debit product3
I'm trying to extract the debit rows only and redirect the output to another file, but I also want to remove the file path and row number from the start of each row, so the desired output is:
DTRANS 1.111111111 12345667889 debit product1
DTRANS 3.333333333 23443655678 debit product3
It was going quite well with the command below:
(Get-Content "C:\temp\sourcetrans.txt" |
where {$_ -like "C:\temp\TRANSFile1.txt:*debit*"}
).Replace('C:\temp\TRANSFile1.txt:','') |
Out-File -FilePath C:\temp\SSFinal.txt -append
and the output file SSFinal.txt contains:
1001:DTRANS 1.111111111 12345667889 debit product1
300:DTRANS 3.333333333 23443655678 debit product3
However, the output still contains the row number.
I thought it would be a simple case of using the * wildcard to filter out the row number, e.g.:
(Get-Content "C:\temp\sourcetrans.txt" |
where {$_ -like "C:\temp\TRANSFile1.txt:*debit*"}
).Replace('C:\temp\TRANSFile1.txt:*:','') |
Out-File -FilePath C:\temp\SSFinal.txt -append
However, this doesn't work: including the * returns the full string again, including the file path and row number. Any advice greatly appreciated.
The .Replace() .NET method only supports literal replacements (it supports neither wildcard expressions nor regexes).
By contrast, PowerShell's -replace operator is regex-based, so using it instead of .Replace() is one option (I'm using a single input line as an example; see further below for a complete solution):
# Remove everything up to and including
# the ":" after the number following the path.
$line -replace '^.:.+?:.+?:'
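To make the difference concrete, here is a minimal sketch that applies both approaches to one of the sample lines from the question. The variable names ($line, $literal, $viaRegex) are only illustrative.

```powershell
# Sample line taken from the question.
$line = 'C:\temp\TRANSFile1.txt:1001:DTRANS 1.111111111 12345667889 debit product1'

# .Replace() searches for the literal text 'C:\temp\TRANSFile1.txt:*:'
# (asterisk included), finds nothing, and returns the line unchanged.
$literal = $line.Replace('C:\temp\TRANSFile1.txt:*:', '')

# -replace treats its first operand as a regex; omitting the replacement
# operand removes whatever the regex matched.
$viaRegex = $line -replace '^.:.+?:.+?:'

$literal   # the unchanged input line
$viaRegex  # DTRANS 1.111111111 12345667889 debit product1
```

This is why the wildcard attempt in the question returned the full string: no literal "*" occurs in the input, so nothing was replaced.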
Another option is to use the -split operator:
# Split the line into at most 4 ":"-separated tokens and extract the last one.
($line -split ':', 4)[-1]
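As a quick illustration of what the split produces, the sketch below shows the individual tokens for one sample line. It assumes, as in the question's data, that the path contains exactly one colon (the one after the drive letter); $tokens is an illustrative name.

```powershell
# Sample line taken from the question.
$line = 'C:\temp\TRANSFile1.txt:300:DTRANS 3.333333333 23443655678 debit product3'

# Limit the split to 4 tokens so that any further ":" characters
# would stay inside the last token instead of splitting it again.
$tokens = $line -split ':', 4

$tokens[0]   # C
$tokens[1]   # \temp\TRANSFile1.txt
$tokens[2]   # 300
$tokens[-1]  # DTRANS 3.333333333 23443655678 debit product3
```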
However, you can also use the regex-based -match operator in your where (Where-Object) call to both match only the lines of interest and use a capture group ((…)) to capture only the relevant part of each such line, which can then be accessed via the automatic $Matches variable in a subsequent ForEach-Object call:
Get-Content C:\temp\sourcetrans.txt |
Where-Object { $_ -match '^.:.+?:.+?:(.*debit.*)$' } |
ForEach-Object { $Matches[1] } |
Out-File -FilePath C:\temp\SSFinal.txt -Append
Note: -Append is only needed if you need to append content to a preexisting file.
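To try the -match approach without touching any files, the following sketch runs the same filter over an in-memory array holding the three sample lines from the question. $sample and $result are illustrative names standing in for the Get-Content input and the Out-File output.

```powershell
# In-memory stand-in for the content of C:\temp\sourcetrans.txt.
$sample = @(
  'C:\temp\TRANSFile1.txt:1001:DTRANS 1.111111111 12345667889 debit product1'
  'C:\temp\TRANSFile1.txt:20002:DTRANS 2.222222222 23143453456 credit product2'
  'C:\temp\TRANSFile1.txt:300:DTRANS 3.333333333 23443655678 debit product3'
)

# Keep only lines whose remainder contains "debit", then emit
# capture group 1 (the text after the path and row number).
$result = $sample |
  Where-Object { $_ -match '^.:.+?:.+?:(.*debit.*)$' } |
  ForEach-Object { $Matches[1] }

$result
# DTRANS 1.111111111 12345667889 debit product1
# DTRANS 3.333333333 23443655678 debit product3
```

Because the pipeline processes one line at a time, $Matches in the ForEach-Object block still reflects the match made for that same line in Where-Object.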
For an explanation of the regex and the ability to experiment with it, see this regex101.com page.
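As a rough inline breakdown, the same regex can be written with .NET's inline (?x) option, which ignores whitespace in the pattern and allows # comments. This is only an annotated restatement of the pattern above; $pattern and $line are illustrative names.

```powershell
# Annotated form of '^.:.+?:.+?:(.*debit.*)$' using the (?x) inline option.
$pattern = @'
(?x)             # ignore pattern whitespace and allow comments
^ . :            # start of line, drive letter and its colon, e.g. "C:"
.+? :            # rest of the path, up to the next colon (non-greedy)
.+? :            # row number, up to the next colon (non-greedy)
( .* debit .* )  # capture group 1: the remainder, which must contain "debit"
$                # end of line
'@

$line = 'C:\temp\TRANSFile1.txt:1001:DTRANS 1.111111111 12345667889 debit product1'
if ($line -match $pattern) { $Matches[1] }
# DTRANS 1.111111111 12345667889 debit product1
```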