Is iText7 available in VB.Net or only C#?

I want to extract the contents of text fields from PDF files and bring them into my WinForms project. While searching I found references to iTextSharp, but then saw that it has been replaced by iText7, and everything I read refers only to its use in C#. My WinForms project is in VB. Any pointers on the best way to get that data into my project would be much appreciated.


  • iText7 is a .NET library, so it can be used from VB.NET as well as C#. To extract text from a PDF file using iText7, try the following:

    Prerequisite: Download/install the NuGet package itext7

    Add the following Imports statements:

    Imports iText.Kernel.Pdf
    Imports iText.Kernel.Pdf.Canvas.Parser.Listener
    Imports iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor


    Public Function GetTextFromPdf(filename As String) As String
        Dim sb As System.Text.StringBuilder = New System.Text.StringBuilder()
        Using doc As PdfDocument = New PdfDocument(New PdfReader(filename))
            'Dim strategy As LocationTextExtractionStrategy = New LocationTextExtractionStrategy()
            'page numbers in iText are 1-based
            For i As Integer = 1 To doc.GetNumberOfPages() Step 1
                Dim page = doc.GetPage(i)
                'Dim text = iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor.GetTextFromPage(page, strategy)
                Dim text = GetTextFromPage(page)
                'append the text of this page to the result
                sb.AppendLine(text)
            Next
        End Using
        Return sb.ToString()
    End Function

    The code for GetTextFromPdf is adapted from here.


    The code below shows how to read the field names and field values from an AcroForm in a PDF document:

    Add the following Imports statements:

    Imports iText.Forms
    Imports iText.Kernel.Pdf


    Public Function GetTextFromPdfFields(filename As String) As String
        Dim sb As System.Text.StringBuilder = New System.Text.StringBuilder()
        'create new instance
        Using doc As PdfDocument = New PdfDocument(New PdfReader(filename))
            'get AcroForm from document
            Dim form As PdfAcroForm = PdfAcroForm.GetAcroForm(doc, True)
            'get form fields
            Dim fieldDict As IDictionary(Of String, Fields.PdfFormField) = form.GetFormFields()
            'loop through form fields
            For Each kvp As KeyValuePair(Of String, Fields.PdfFormField) In fieldDict
                'the dictionary value is the field itself, so no second lookup is needed
                Dim field As Fields.PdfFormField = kvp.Value
                Dim type As PdfName = field.GetFormType()
                Dim fieldName As PdfString = field.GetFieldName()
                Dim fieldValue As String = field.GetValueAsString()
                'skip fields that have no name or no form type
                If fieldName IsNot Nothing AndAlso type IsNot Nothing Then
                    'append data to instance of StringBuilder
                    sb.AppendLine("Type: " & type.ToString() & " FieldName: " & fieldName.ToString() & " Value: " & fieldValue)
                End If
            Next
        End Using
        Return sb.ToString()
    End Function

    Note: The code for GetTextFromPdfFields is adapted from here.