powershell, rename-item-cmdlet

PowerShell: rename text files based on strings in the text. Is there a more concise way to write this script?


I am trying to rename bank statements in MT940 format using the account number and the statement date.

The statements contain the following (example):

    :20:
    :25:MHCBNL2AXXX/**0364525123**
    :28C:27/
    :60F:C200207EUR100000,00
    :61:2012311231D0000,1NMSCTOPF1234567890SDD TOPF1234567890
    :86:FR1234567890ARVAL FRANCE
    :62F:C**200207**EUR100000,00

I have written the following PowerShell script by combining some examples, but it seems quite long for its purpose. Question: is there a more concise way to write this script?

    $files = Get-ChildItem "C:\Dropbox\Temp\Gerard\test\*" -Include *.txt, *.ged
    for ($i=0; $i -lt $files.Count; $i++)
    {
        $filename = $files[$i].FullName

        #Rename the file based on strings in the file
        $Account = (Get-Content -Raw -Path $fileName)
        $Account -match ":25:.+(\d{10})"
        $Account = $matches[1]

        $StatementDate = (Get-Content -Raw -Path $fileName)
        $StatementDate -match ":62F:C(?<content>.*)EUR"
        $StatementDate = $matches['content']

        $file=Get-Item $filename
        $file.Basename
        $extension=$file.Extension

        Rename-Item -Path $filename -NewName "$StatementDate-$Account$extension"
    }


Solution

  • You could achieve the same result with the following:

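    $Files = Get-ChildItem '/Users/acc/Downloads/bank/*' -Include '*.txt', '*.ged'
    foreach ($File in $Files) {
        # Read each file once; -Raw returns the whole content as a single string.
        $Content = Get-Content -LiteralPath $File.FullName -Raw

        # The \*{2} parts match the literal ** shown around the values in the sample.
        # If those asterisks are only emphasis and not in the real files, remove \*{2} from both patterns.
        $Account = [Regex]::Match($Content, ':25:.+\*{2}(?<Account>\d{10})\*{2}').Groups['Account'].Value
        $StatementDate = [Regex]::Match($Content, ':62F:C\*{2}(?<StatementDate>\d+)\*{2}EUR').Groups['StatementDate'].Value

        # New name: <StatementDate>-<Account><original extension>
        Rename-Item -LiteralPath $File.FullName -NewName ('{0}-{1}{2}' -f $StatementDate, $Account, $File.Extension)
    }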
    
    • Iterating over the collection with a foreach loop, instead of an index-based for loop, gives you each file as an object directly, so you can read its properties without indexing into the array.
      • For example, instead of calling Get-Item $filename just to get the extension, the current item in the foreach loop is already a System.IO.FileInfo object (a subclass of System.IO.FileSystemInfo), so its extension is available as $File.Extension.
    • You were reading each file twice with Get-Content, where once per file is enough.
    • In my opinion, using the .NET Match() method of the Regex class is cleaner than using the -match operator, but this is personal preference.
      • I did try the Matches() method so that I could pass both regex patterns, joined with a pipe (|), in one call. Each returned match had only one of the two groups populated: one contained 'Account' but not 'StatementDate', and the other the reverse. That follows from how alternation works: every match comes from exactly one alternative, so the group belonging to the other alternative stays empty in that match.
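
As a rough sketch of the Matches() approach mentioned above: a single pattern with both alternatives can still yield both values if you collect each named group from whichever match populated it. The combined pattern, the $Found hashtable, and the inline sample text are illustrative and not part of the original answer. The sample is the one from the question with the ** markers removed, on the assumption that they are only emphasis and not present in the real files.

```powershell
# Sample statement text from the question, with the ** markers removed (assumption).
$Content = @'
:20:
:25:MHCBNL2AXXX/0364525123
:28C:27/
:60F:C200207EUR100000,00
:62F:C200207EUR100000,00
'@

# One pattern, two alternatives; each alternative carries its own named group.
$Pattern = ':25:.+?(?<Account>\d{10})|:62F:C(?<StatementDate>\d{6})EUR'

# Collect each named group from the match in which it actually participated.
$Found = @{}
foreach ($Match in [Regex]::Matches($Content, $Pattern)) {
    foreach ($Name in 'Account', 'StatementDate') {
        # The group of the alternative that did not match reports Success = $false.
        if ($Match.Groups[$Name].Success) {
            $Found[$Name] = $Match.Groups[$Name].Value
        }
    }
}

$NewBaseName = '{0}-{1}' -f $Found['StatementDate'], $Found['Account']
$NewBaseName   # 200207-0364525123
```

Whether this is tidier than two separate Match() calls is debatable; it mainly shows why each match returned by Matches() has only one of the two groups filled.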