Quickly append (merge) a PDF document onto 50,000 other documents


Every month we have 50,000 documents that each need the same separate PDF appended to the end.

I have a PowerShell script that loops through the list of 50,000 files and calls PDFtk to merge each one, but it takes many hours, even on a machine with plenty of RAM.

The core of the code is this:

foreach ($pdf in $filelist){
  <#Get all variables#>
  ...
  $merge = '"' + $TKPath + '" "' + $FirstPDFPath + '" "' + $SecondPDFPath + '" cat output "' + $OutputPDFPath + '"'
  cmd.exe /c $merge
}

Is this an issue with PDFtk? Or am I causing problems by calling cmd.exe /c inside the loop? Can I call PDFtk directly, without cmd.exe? I've never got that to work.


Solution

  • Starting 50,000 cmd.exe processes, each of which then launches pdftk, will definitely carry some overhead.

    You could try invoking pdftk directly from PowerShell with the & call operator:

    & $TKPath $FirstPDFPath $SecondPDFPath cat output $OutputPDFPath
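
For context, here is a minimal sketch of how that direct call might sit inside the loop. The function name Add-PdfAppendix, the $InputFiles parameter, and the single $OutputDir folder are hypothetical, introduced only for illustration; the question does not show how its paths are actually built, so adapt that part to the real script. It also assumes the output folder differs from the input folder, since pdftk should not write over a file it is reading.

```powershell
# Sketch only. Add-PdfAppendix, $InputFiles and $OutputDir are hypothetical names;
# the question does not show how its own paths are derived.
function Add-PdfAppendix {
    param(
        [Parameter(Mandatory)] [string]   $TKPath,        # full path to the pdftk executable
        [Parameter(Mandatory)] [string]   $SecondPDFPath, # the PDF appended to every document
        [Parameter(Mandatory)] [string[]] $InputFiles,    # full paths of the source PDFs
        [Parameter(Mandatory)] [string]   $OutputDir      # assumed: all outputs go to one folder
    )

    foreach ($FirstPDFPath in $InputFiles) {
        # Assumed naming scheme: keep the original file name, change the folder.
        $OutputPDFPath = Join-Path $OutputDir (Split-Path $FirstPDFPath -Leaf)

        # Each variable is passed as a single argument, so paths containing
        # spaces need no manual quoting and no cmd.exe wrapper.
        & $TKPath $FirstPDFPath $SecondPDFPath cat output $OutputPDFPath

        # $LASTEXITCODE holds the exit code of the last external program run.
        if ($LASTEXITCODE -ne 0) {
            Write-Warning "pdftk failed for '$FirstPDFPath' (exit code $LASTEXITCODE)"
        }
    }
}
```

It could then be called as, for example, Add-PdfAppendix -TKPath $TKPath -SecondPDFPath $SecondPDFPath -InputFiles $filelist -OutputDir $OutputDir, assuming $filelist holds full paths. This removes only the cmd.exe layer; pdftk is still started once per document, so how much time it saves would need to be measured.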