The following code uses PDFSharp to sort the pages of PDF documents into two outputs: pages that are A4 or smaller, and larger pages, which are placed on A3:
''' <summary>
''' Process the list of pdfs
''' </summary>
Public Sub ProcessPdfs()
Dim tempPath As String
' Code omitted
' Generate a temporary path in case pdfs need to be saved
If String.IsNullOrEmpty(Me.tempFolder) OrElse Not Directory.Exists(Me.tempFolder) Then
tempFolder = Path.GetTempPath()
End If
tempPath = Path.Combine(Me.tempFolder, Path.GetRandomFileName() & ".pdf")
' Loop through the pages of the pdfs and process each page in turn. Processing involves
' determining the size of the page, then shrinking, adding the footer and then adding to
' the appropriate output pdf
For Each referenceNumber As String In Me.Pdfs.Keys
For Each pdf As PdfDocument In Me.Pdfs(referenceNumber)
' Save the pdf to disk for PDFSharp to be able to read it properly
If String.IsNullOrEmpty(pdf.FullPath) Then
pdf.Save(tempPath)
pdf = PdfReader.Open(tempPath)
End If
For Each page As PdfPage In pdf.Pages
' Code omitted
Select Case pageArea
Case Is <= a4PageArea
Call AddPage(referenceNumber, pdf, page, PageSize.A4)
Case Else
Call AddPage(referenceNumber, pdf, page, PageSize.A3)
End Select
Next
Next
Next
' Code omitted
' Delete temporary pdfs if there are any
If File.Exists(tempPath) Then
File.Delete(tempPath)
End If
End Sub
''' <summary>
''' Add the specified page to the specified output document
''' </summary>
''' <returns>The page which was added to the output pdf</returns>
Private Function AddPage(ByVal ReferenceNumber As String, ByVal ParentPdf As PdfDocument, ByVal ParentPdfPage As PdfPage, ByVal PageSize As PageSize) As PdfPage
' Code omitted
' Copy the specified page onto the newly created page
Using parentForm As XPdfForm = XPdfForm.FromFile(ParentPdf.FullPath)
parentForm.PageIndex = ParentPdf.Pages.Cast(Of PdfPage)().ToList().IndexOf(ParentPdfPage)
scaleFactor = 1
' Create PdfSharp graphics object with which to write onto the page
Using graphics As XGraphics = XGraphics.FromPdfPage(outputPdfPage)
graphics.SmoothingMode = XSmoothingMode.HighQuality
' Code omitted
' Draw the page
Call graphics.DrawImage(parentForm, targetRect)
End Using
End Using
Return outputPdfPage
End Function
What this does is take a pdf, read each page and then scale it so that it fits the size of the page onto which it is to be printed.
PDFSharp has trouble opening documents which were created in Adobe v6, so I use iTextSharp to rebuild the pdf in a version that PDFSharp can open. These PDFs are rebuilt in memory, and for some reason they need to be written to disk for PDFSharp to process them correctly.
In ProcessPdfs() I check if the pdf has a physical path and if not I save it at a temp location.
The problem I found is that AddPage() seems to continuously work with the same pdf. I checked the temporary pdf files created on disk and they are correct, i.e. different each time. But the file loaded in the first using statement by XPdfForm.FromFile(ParentPdf.FullPath) never changes.
It's as if the code realises that the file path does not change and so decides not to reload the file.
I thought that using a using statement would ensure that the variable would be disposed of at the end and therefore the file would be reloaded anew every time. Am I misunderstanding? Or what is happening here?
Incidentally, I worked around this by saving each pdf file under a different file name, which is why I think that the variable from the using block is being reused every time based on the file name...
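For reference, the workaround can be sketched roughly as follows. This is only a minimal sketch, not the original code: the module TempPdfHelper, the functions SaveToUniqueTempFile and DeleteTempFiles, and the tempFiles list are hypothetical names introduced for illustration. The idea is to generate a fresh random file name for every pdf that has to be written to disk, and to remember each path so that it can be deleted afterwards.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO

' Hypothetical helper module, not part of the original code
Module TempPdfHelper

    ''' <summary>
    ''' Saves the pdf under a new random file name and reopens it from disk,
    ''' so that no two documents ever share the same path
    ''' </summary>
    Public Function SaveToUniqueTempFile(ByVal pdf As PdfDocument, ByVal tempFolder As String, ByVal tempFiles As List(Of String)) As PdfDocument
        ' A new name per document, instead of one name generated before the loop
        Dim tempPath As String = Path.Combine(tempFolder, Path.GetRandomFileName() & ".pdf")
        pdf.Save(tempPath)
        ' Remember the path so that the file can be cleaned up later
        tempFiles.Add(tempPath)
        Return PdfReader.Open(tempPath)
    End Function

    ''' <summary>
    ''' Deletes every temporary pdf that was recorded while processing
    ''' </summary>
    Public Sub DeleteTempFiles(ByVal tempFiles As List(Of String))
        For Each tempFile As String In tempFiles
            If File.Exists(tempFile) Then
                File.Delete(tempFile)
            End If
        Next
    End Sub

End Module
```

In ProcessPdfs() this would replace the pdf.Save(tempPath) and PdfReader.Open(tempPath) pair inside the loop, with DeleteTempFiles called at the end instead of the single File.Delete.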
XPdfForm caches documents internally, and the file name is the cache key. If you reuse a file name for a new document, the old, cached document will be used.
The cache is thread-local.
So it's not a bug, it's a feature.
It should be possible to use streams instead of files.
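A rough sketch of the stream-based approach is shown below. It assumes that the PDFsharp version in use provides XPdfForm.FromStream (as far as I know it is available in the 1.50 releases, but check your version), and it has not been tested against documents rebuilt with iTextSharp. The module StreamFormHelper and the function LoadFormFromMemory are hypothetical names used only for illustration.

```vbnet
Imports System.IO
Imports PdfSharp.Drawing
Imports PdfSharp.Pdf

' Hypothetical helper module, not part of the original code
Module StreamFormHelper

    ''' <summary>
    ''' Builds an XPdfForm from an in-memory copy of the pdf, so that no
    ''' file name is involved in loading the form
    ''' </summary>
    Public Function LoadFormFromMemory(ByVal pdf As PdfDocument) As XPdfForm
        Dim buffer As New MemoryStream()
        ' False asks PDFsharp to leave the stream open after saving
        pdf.Save(buffer, False)
        ' Rewind so that the form reads the document from the start
        buffer.Position = 0
        ' Assumption: FromStream exists in the installed PDFsharp version.
        ' The stream is deliberately not disposed here, because the form
        ' may still need it while it is in use
        Return XPdfForm.FromStream(buffer)
    End Function

End Module
```

In AddPage() the form returned by LoadFormFromMemory(ParentPdf) would then take the place of XPdfForm.FromFile(ParentPdf.FullPath) inside the Using statement.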