VB.NET: Merge PDFs in a directory that share a similar filename


I currently have a directory in which I split a PDF that had multiple headers/barcodes into separate files, named according to the following pattern:

File# Header Sheet#, so the files look like this:

ZTEST01 Cover Sheet 1

ZTEST01 Cover Sheet 2

ZTEST01 Complaint 3

ZTEST01 Complaint 4

ZTEST01 Exhibit 5

ZTEST01 Exhibit 6

ZTEST01 Summons 8

ZTEST01 Summons 9

My goal is to have the code iterate through this directory and merge all the files that share the same header name (the middle part of the file name):

ZTEST01 Cover Sheet 1 + ZTEST01 Cover Sheet 2 = ZTEST01 Cover Sheet

This is the code I have so far (I was going back and forth between PDFsharp and Bytescout, so I'm leaving the imports alone until I figure out which works best):

Imports Bytescout.PDFExtractor
Imports System.Diagnostics
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.Path
Imports System.Linq
Imports System.Text
Imports System.Threading.Tasks
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO

Module Module2




    Public Sub Main(ByVal args As String())
        Dim Dir As String = "G:\Word\Department Folders\Pre-Suit\Drafts-IL\2-IL_AttyReview\2018-09\Reviewed\unmerged"
        Dim name As String = "Complaint"

        Dim supportedfiles As New List(Of String)()
        For Each files As String In Directory.GetFiles(Dir, "*.pdf")
            Dim filename As String = GetFileName(files).ToLower()

            If filename Like name Then
                supportedfiles.Add(files)
            End If
        Next files



        Dim outputPdfDocument As PdfDocument = New PdfDocument()


        For Each files As String In supportedfiles
            Merge(outputPdfDocument, files)

            Dim Path As String = IO.Path.GetFileNameWithoutExtension(files)

            outputPdfDocument.Save(Dir & "\Merge\" & Path & "Complaint" & ".pdf")
        Next

        Console.ReadKey()


    End Sub

    Public Sub Merge(ByVal outputPDFDocument As PdfDocument, ByVal pdfFile As String)
            Dim inputPDFDocument As PdfDocument = PdfReader.Open(pdfFile, PdfDocumentOpenMode.Import)
            outputPDFDocument.Version = inputPDFDocument.Version

            For Each page As PdfPage In inputPDFDocument.Pages
                outputPDFDocument.AddPage(page)

            Next

        End Sub


End Module

For now I tried filtering with filename Like "Complaint" to see if it works, but so far it just brings up a blank command prompt.

I'd like to do this for

"Cover Sheet"

"Complaint"

"Exhibit"

and "Summons"

Any suggestions would be greatly appreciated.


Solution
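
The blank console in the question is most likely explained by its filter. `Like` without wildcards has to match the whole string, and the file name is lowercased while the pattern keeps its capital C, so `supportedfiles` stays empty and nothing is merged or printed. A minimal sketch of that comparison, assuming the default `Option Compare Binary` (the module name and the sample file name are illustrative, not taken from the original project):

```vbnet
Option Compare Binary
Imports System

Module LikeDemo
    Sub Main()
        ' Same preparation as in the question: lowercase the file name first.
        Dim filename As String = "ZTEST01 Complaint 3.pdf".ToLower()

        ' No wildcards: the pattern must equal the entire string, so this is False.
        Console.WriteLine(filename Like "Complaint")

        ' Wildcards on both sides plus a lowercase pattern: this is True.
        Console.WriteLine(filename Like "*complaint*")
    End Sub
End Module
```

Passing the wildcard pattern straight to `Directory.GetFiles`, as the code in this answer does, avoids the `Like` comparison altogether.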

        Imports System.IO
        Imports System.IO.Path
        Imports PdfSharp.Pdf
        Imports PdfSharp.Pdf.IO

        Module Module1
            ' Folder that holds the split, unmerged PDFs.
            Private inputdir As String = "G:\Word\Department Folders\Pre-Suit\Drafts-IL\2-IL_AttyReview\2018-09\Reviewed\unmerged\"

            Public Sub Main()
                ' One merge pass per header name.
                MergeFiles("Cover Sheet", inputdir)
                MergeFiles("Complaint", inputdir)
                MergeFiles("Exhibit", inputdir)
                MergeFiles("Military", inputdir)
                MergeFiles("Summons", inputdir)
            End Sub

            Public Sub MergeFiles(ByVal name As String, ByVal inputdir As String)
                Dim OutputFile As String
                ' Path.Combine avoids the doubled backslash that inputdir & "\Merge\" would produce.
                Dim OutputDir As String = Path.Combine(inputdir, "Merge")
                Dim OutputDocument As PdfDocument

                If Not Directory.Exists(OutputDir) Then Directory.CreateDirectory(OutputDir)

                ' GetFiles searches only the top-level folder, so files already in Merge are not picked up.
                For Each files As String In Directory.GetFiles(inputdir, "*" & name & "*.pdf")
                    ' Assumes the file number is the first 7 characters of the name, e.g. "ZTEST01".
                    OutputFile = Path.Combine(OutputDir, GetFileNameWithoutExtension(files).Substring(0, 7) & " " & name & ".pdf")

                    ' Append to the merged file if it already exists; otherwise start a new document.
                    ' Note: running the program a second time appends the same pages again.
                    If File.Exists(OutputFile) Then
                        OutputDocument = PdfReader.Open(OutputFile)
                    Else
                        OutputDocument = New PdfDocument()
                    End If

                    Console.WriteLine("Merging: {0}...", GetFileName(files))

                    ' Import mode is required to copy pages into another document.
                    Using InputDocument As PdfDocument = PdfReader.Open(files, PdfDocumentOpenMode.Import)
                        For Each page As PdfPage In InputDocument.Pages
                            OutputDocument.AddPage(page)
                        Next
                    End Using

                    OutputDocument.Save(OutputFile)
                    OutputDocument.Dispose()
                Next

            End Sub
        End Module
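
The code above hardcodes the header names and assumes a 7-character file number. As an alternative sketch, the group key could be derived from each file name by dropping the trailing sheet number, with the sheets sorted numerically so that sheet 10 follows sheet 9. The helpers `GroupKey` and `SheetNumber` are hypothetical names introduced here, and the sketch assumes every file name ends in a space followed by a number. It only prints the grouping and does not touch any PDF:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module GroupingSketch
    ' Hypothetical helper: "ZTEST01 Cover Sheet 2.pdf" -> "ZTEST01 Cover Sheet"
    Function GroupKey(ByVal filePath As String) As String
        Dim stem As String = Path.GetFileNameWithoutExtension(filePath)
        Return stem.Substring(0, stem.LastIndexOf(" "c))
    End Function

    ' Hypothetical helper: "ZTEST01 Cover Sheet 2.pdf" -> 2
    Function SheetNumber(ByVal filePath As String) As Integer
        Dim stem As String = Path.GetFileNameWithoutExtension(filePath)
        Return Integer.Parse(stem.Substring(stem.LastIndexOf(" "c) + 1))
    End Function

    Sub Main()
        ' Sample names standing in for the result of Directory.GetFiles.
        Dim files As String() = {
            "ZTEST01 Exhibit 10.pdf", "ZTEST01 Exhibit 9.pdf",
            "ZTEST01 Cover Sheet 2.pdf", "ZTEST01 Cover Sheet 1.pdf"}

        ' One group per output file; sheets inside a group are ordered by number.
        For Each g As IGrouping(Of String, String) In files.GroupBy(Function(f) GroupKey(f))
            Console.WriteLine(g.Key & ".pdf")
            For Each f As String In g.OrderBy(Function(x) SheetNumber(x))
                Console.WriteLine("  " & f)
            Next
        Next
    End Sub
End Module
```

Each group's ordered files could then be fed to the same page-copy loop used in `MergeFiles`, writing one output document per group key.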