I currently have a directory where I split a PDF that had multiple headers/barcodes into separate files. Each file is named using the pattern
File# Header Sheet#, so the directory looks like this:
ZTEST01 Cover Sheet 1
ZTEST01 Cover Sheet 2
ZTEST01 Complaint 3
ZTEST01 Complaint 4
ZTEST01 Exhibit 5
ZTEST01 Exhibit 6
ZTEST01 Summons 8
ZTEST01 Summons 9
My goal is to have the code iterate through this directory and merge all the files that share the same header name (the middle part of the file name):
ZTEST01 Cover Sheet 1 + ZTEST01 Cover Sheet 2 = ZTEST01 Cover Sheet
This is the code I have so far (I was going back and forth between PDFsharp and Bytescout, so I'm leaving the imports alone until I figure out what works best):
Imports Bytescout.PDFExtractor
Imports System.Diagnostics
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.Path
Imports System.Linq
Imports System.Text
Imports System.Threading.Tasks
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO

Module Module2
    Public Sub Main(ByVal args As String())
        Dim Dir As String = "G:\Word\Department Folders\Pre-Suit\Drafts-IL\2-IL_AttyReview\2018-09\Reviewed\unmerged"
        Dim name As String = "Complaint"
        Dim supportedfiles As New List(Of String)()
        For Each files As String In Directory.GetFiles(Dir, "*.pdf")
            Dim filename As String = GetFileName(files).ToLower()
            If filename Like name Then
                supportedfiles.Add(files)
            End If
        Next files
        Dim outputPdfDocument As PdfDocument = New PdfDocument()
        For Each files As String In supportedfiles
            Merge(outputPdfDocument, files)
            Dim Path As String = IO.Path.GetFileNameWithoutExtension(files)
            outputPdfDocument.Save(Dir & "\Merge\" & Path & "Complaint" & ".pdf")
        Next
        Console.ReadKey()
    End Sub

    Public Sub Merge(ByVal outputPDFDocument As PdfDocument, ByVal pdfFile As String)
        Dim inputPDFDocument As PdfDocument = PdfReader.Open(pdfFile, PdfDocumentOpenMode.Import)
        outputPDFDocument.Version = inputPDFDocument.Version
        For Each page As PdfPage In inputPDFDocument.Pages
            outputPDFDocument.AddPage(page)
        Next
    End Sub
End Module
I tried matching the file name against "Complaint" for now to see if it works, but so far it just brings up a blank cmd prompt.
I'd like to do this for "Cover Sheet", "Complaint", "Exhibit", and "Summons".
Any suggestions would be greatly appreciated.
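A likely reason for the blank prompt: the Like operator has to match the whole string unless the pattern contains wildcards, and the file name was lower-cased while the pattern "Complaint" starts with a capital C. In that case no file is added to supportedfiles, nothing is merged, and the program simply waits at Console.ReadKey(). The minimal sketch below shows the three cases using a made-up file name taken from the listing above; it assumes the default Option Compare Binary setting.

```vb
Imports System

Module LikeSketch
    Sub Main()
        ' Lower-cased file name, as GetFileName(files).ToLower() would produce it.
        Dim filename As String = "ztest01 complaint 3.pdf"

        ' No wildcards: Like must match the entire string, so this is False.
        Console.WriteLine(filename Like "Complaint")

        ' Wildcards added, but the capital C still does not match the
        ' lower-cased name under Option Compare Binary: False.
        Console.WriteLine(filename Like "*Complaint*")

        ' Wildcards plus a lower-case pattern: True.
        Console.WriteLine(filename Like "*complaint*")
    End Sub
End Module
```

Passing the pattern to Directory.GetFiles instead (for example "*Complaint*.pdf") avoids the Like comparison altogether.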
Solution:
Imports System.IO
Imports System.IO.Path
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO

Module Module1
    Private inputdir As String = "G:\Word\Department Folders\Pre-Suit\Drafts-IL\2-IL_AttyReview\2018-09\Reviewed\unmerged\"

    Public Sub Main()
        ' One pass per header name; each pass merges the matching sheets.
        MergeFiles("Cover Sheet", inputdir)
        MergeFiles("Complaint", inputdir)
        MergeFiles("Exhibit", inputdir)
        MergeFiles("Military", inputdir)
        MergeFiles("Summons", inputdir)
    End Sub

    Public Sub MergeFiles(ByVal name As String, ByVal inputdir As String)
        Dim OutputFile As String
        ' Path.Combine avoids a doubled backslash when inputdir already ends with "\".
        Dim OutputDir As String = Path.Combine(inputdir, "Merge")
        Dim OutputDocument As PdfDocument

        If Not Directory.Exists(OutputDir) Then Directory.CreateDirectory(OutputDir)

        For Each files As String In Directory.GetFiles(inputdir, "*" & name & "*.pdf")
            ' Assumes the first 7 characters are the file number (e.g. ZTEST01).
            OutputFile = Path.Combine(OutputDir, GetFileNameWithoutExtension(files).Substring(0, 7) & " " & name & ".pdf")

            ' Append to the merged file if an earlier sheet already created it.
            If File.Exists(OutputFile) Then
                OutputDocument = PdfReader.Open(OutputFile)
            Else
                OutputDocument = New PdfDocument()
            End If

            Console.WriteLine("Merging: {0}...", GetFileName(files))

            ' Import mode is required to copy pages into another document.
            Using InputDocument As PdfDocument = PdfReader.Open(files, PdfDocumentOpenMode.Import)
                For Each page As PdfPage In InputDocument.Pages
                    OutputDocument.AddPage(page)
                Next
            End Using

            OutputDocument.Save(OutputFile)
            OutputDocument.Dispose()
        Next
    End Sub
End Module
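
The solution hard-codes the header names and a 7-character file number, and it merges sheets in whatever order Directory.GetFiles returns them, which is not guaranteed to be sorted. As a rough sketch of one alternative (not part of the solution above), the group key can be taken from the file name itself by stripping the trailing sheet number, and the sheets can then be ordered numerically. GroupKey and SheetNumber are hypothetical helpers, and the sketch assumes every name ends in a space followed by a number. It only prints the groups and does not call PDFsharp.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module GroupingSketch
    ' Hypothetical helper: "ZTEST01 Cover Sheet 1" -> "ZTEST01 Cover Sheet".
    ' Assumes the name ends with " <sheet number>".
    Function GroupKey(ByVal baseName As String) As String
        Dim lastSpace As Integer = baseName.LastIndexOf(" "c)
        If lastSpace < 0 Then Return baseName
        Return baseName.Substring(0, lastSpace)
    End Function

    ' Hypothetical helper: returns the trailing sheet number, or 0 if there is none.
    Function SheetNumber(ByVal baseName As String) As Integer
        Dim lastSpace As Integer = baseName.LastIndexOf(" "c)
        Dim n As Integer
        If lastSpace >= 0 AndAlso Integer.TryParse(baseName.Substring(lastSpace + 1), n) Then Return n
        Return 0
    End Function

    Sub Main()
        ' File names without extension, taken from the listing in the question.
        Dim names As String() = {
            "ZTEST01 Cover Sheet 1", "ZTEST01 Cover Sheet 2",
            "ZTEST01 Complaint 3", "ZTEST01 Complaint 4",
            "ZTEST01 Exhibit 5", "ZTEST01 Exhibit 6",
            "ZTEST01 Summons 8", "ZTEST01 Summons 9"
        }

        ' One group per merged output file; sheets sorted by number within each group.
        For Each g As IGrouping(Of String, String) In names.GroupBy(Function(s) GroupKey(s))
            Dim ordered As IEnumerable(Of String) = g.OrderBy(Function(s) SheetNumber(s))
            Console.WriteLine("{0}.pdf <- {1}", g.Key, String.Join(", ", ordered))
        Next
    End Sub
End Module
```

In a real run, each group would be merged into a single new PdfDocument and saved once, which also avoids appending to a merged file left over from a previous run.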