I am currently unable to load the original PDF document using GemBox. It gives me the error shown in the image below. I am using Acrobat 9.
I have also tried the 8/16/2018 fixes. Any suggestions would be highly appreciated.
The basic code I am using is:
    using GemBox.Document;
    using System;

    namespace Pdf2Text
    {
        class Program
        {
            [STAThread]
            static void Main(string[] args)
            {
                ComponentInfo.SetLicense("My-License");
                DocumentModel document = null;
                document = DocumentModel.Load(@"E:\data\testing\HA021.pdf");
                document.Save(@"E:\data\testing\HA021.docx");
            }
        }
    }
EDIT:
Newer versions of GemBox.Document include another PDF reader that is intended for high-fidelity tasks; see the Convert PDF to Word example.
Here is how to use it:
    var document = DocumentModel.Load("Sample.pdf",
        new PdfLoadOptions() { LoadType = PdfLoadType.HighFidelity });
    document.Save("Sample.docx");
ORIGINAL:
The current implementation of the PDF reader in GemBox.Document is still in beta and cannot handle one feature of this PDF: "iref streams", which are cross-reference tables stored in streams.
However, GemBox.Pdf can handle cross-reference streams, so as a workaround you could do something like the following:
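    // Load PDF with GemBox.Pdf.
    var pdfDocument = PdfDocument.Load("Sample.pdf");
    // Write a classic cross-reference table instead of a cross-reference stream.
    pdfDocument.SaveOptions.CrossReferenceType = PdfCrossReferenceType.Table;
    // Save PDF with GemBox.Pdf.
    var pdfStream = new MemoryStream();
    pdfDocument.Save(pdfStream);
    // Rewind the stream so that reading starts at the beginning (a precaution).
    pdfStream.Position = 0;
    // Load PDF with GemBox.Document.
    var document = DocumentModel.Load(pdfStream, LoadOptions.PdfDefault);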
Lastly, regarding the conversion of PDF to DOCX: GemBox.Document's PDF reader is currently intended for extracting text and tables from PDF files; it is not intended for any high-fidelity requirement.
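Since the reader is aimed at text and table extraction, a minimal sketch of that use case might look like the following. It assumes the file loads successfully; "Sample.pdf" is a placeholder path, and the "FREE-LIMITED-KEY" license is only an assumption for trying it out (free mode limits the document size, so use your own key for real files).

```csharp
using System;
using System.Linq;
using GemBox.Document;
using GemBox.Document.Tables;

class Program
{
    static void Main()
    {
        // Assumption: free-mode key; replace with your own license key.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // "Sample.pdf" is a placeholder path.
        var document = DocumentModel.Load("Sample.pdf");

        // Plain text of the whole document.
        Console.WriteLine(document.Content.ToString());

        // Walk every table found anywhere in the document and print it row by row.
        foreach (Table table in document.GetChildElements(true, ElementType.Table))
        {
            foreach (TableRow row in table.Rows)
            {
                var cells = row.Cells
                    .Cast<TableCell>()
                    .Select(cell => cell.Content.ToString().Trim());
                Console.WriteLine(string.Join(" | ", cells));
            }
        }
    }
}
```

Whether any tables are detected depends on how the PDF was produced, so treat the table loop as a starting point, not a guarantee.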