powershell sorting select-into-outfile

Write a script to sort words in alphabetical order from a specific file and put them into 26 text files named A.txt, B.txt, and so on up to Z.txt


I need to sort words alphabetically from a specific file and put them into 26 text files named A.txt, B.txt and so on up to Z.txt.

$Content = Get-Content ".\1.txt"
$Content = ($Content.Split(" .,:;?!/()[]{}-```"")|sort)
$linecount = 0
$filenumber = 0
$destPath = "C:\test"
$destFileSize = 26

$Content |Group {$_.Substring(0,1).ToUpper()} |ForEach-Object {
$path = Join-Path $destPath $_.Name
$_.Group |Set-Content $path
}

$Content | % {
Add-Content $destPath$filenumber.txt "$_"
$linecount++
If ($linecount -eq $destFileSize) {
$filenumber++  
$linecount = 0
}
}

Solution

  • You could do something like this. Note that no file is written for a letter if the input contains no words beginning with it:

    $destPath = "D:\test"
    # read the whole file as one string, split it on non-word characters and drop the empty strings
    (Get-Content -Path 'D:\Test\Lorem.txt' -Raw) -split '\W' -ne '' |
    Group-Object {$_.Substring(0,1).ToUpperInvariant()} | 
    Where-Object {$_.Name -cmatch '[A-Z]'} | ForEach-Object {
        $_.Group | Sort-Object | Set-Content -Path (Join-Path -Path $destPath -ChildPath ('{0}.txt' -f $_.Name))
    }
    

    If you always want exactly 26 files, even if some of them end up empty, use this instead:

    $destPath = "D:\test"
    $wordGroups = (Get-Content -Path 'D:\Test\Lorem.txt' -Raw) -split '\W' -ne '' |
                   Group-Object {$_.Substring(0,1).ToUpperInvariant()}
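    # split the alphabet into single letters: the capture group keeps each letter, -ne '' drops the empty strings
    foreach ($char in ('ABCDEFGHIJKLMNOPQRSTUVWXYZ' -split '(.)' -ne '')) {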
        $outFile = Join-Path -Path $destPath -ChildPath ('{0}.txt' -f $char)
        $group = $wordGroups | Where-Object { $_.Name -eq $char }
        if ($group) { $group.Group | Sort-Object | Set-Content -Path $outFile }  # output the found words
        else { $null | Set-Content -Path $outFile }                              # or create an empty file
    }
    

    In the first snippet, the Where-Object {$_.Name -cmatch '[A-Z]'} clause skips words that start with any character other than A to Z (for example a digit or an underscore, which \W does not split on).
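
To see what the split and the A to Z filter do without touching any files, here is a minimal sketch that runs the same pipeline on an in-memory string. The sample text and the $byLetter lookup are made up for illustration and are not part of the scripts above.

```powershell
# Hypothetical sample text; the real scripts read the file with Get-Content -Raw
$text = 'Lorem ipsum, dolor sit amet! 42 apples, _private and lorem again.'

# Split on non-word characters and drop the empty strings that
# adjacent separators (like ", " or "! ") leave behind
$words = $text -split '\W' -ne ''

# Group by uppercased first character and keep only the groups A to Z;
# '42' and '_private' survive the split but are dropped here
$groups = $words |
    Group-Object { $_.Substring(0,1).ToUpperInvariant() } |
    Where-Object { $_.Name -cmatch '[A-Z]' }

# Collect letter -> sorted words in a hashtable instead of writing files
$byLetter = @{}
foreach ($g in $groups) {
    $byLetter[$g.Name] = @($g.Group | Sort-Object)
}

# Show one line per letter that actually has words
$byLetter.Keys | Sort-Object | ForEach-Object {
    '{0}: {1}' -f $_, ($byLetter[$_] -join ', ')
}
```

With this sample only the letters A, D, I, L and S get an entry, which is why the first snippet can write fewer than 26 files.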