c#, pdf, text-extraction, image-extraction

Converting PDF into workable text using C#


Is there a library with a class for extracting the text from a PDF file in C#/.NET? I've tried a few, but their documentation is terrible, so I haven't been able to get anything off the ground. A class for extracting images would be a plus. Any suggestions? Thanks in advance.

Also, I need to be able to integrate it into an existing application.


Solution

  • Have you tried PDFKit.NET? It has reasonable docs and some good examples. It is designed for a server environment, so it is a little expensive.

    EDIT: There is also an open-source library on SourceForge called iTextSharp. It is free for open-source projects. I haven't used it, but it looks promising, and there is a tutorial for it with lots of code examples.
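
For anyone trying the iTextSharp route, here is a rough sketch of page-by-page text extraction. It assumes an iTextSharp 5.x release, where `PdfReader` and `PdfTextExtractor` are available. The class name `PdfTextDump` and the helper `ExtractText` are made up for illustration and are not part of the library. It has not been checked against other versions, so treat it as a starting point rather than a reference implementation.

```csharp
using System;
using System.Text;
using iTextSharp.text.pdf;         // PdfReader
using iTextSharp.text.pdf.parser;  // PdfTextExtractor

public static class PdfTextDump
{
    // Hypothetical helper (not a library API): returns the text of
    // every page in the PDF, one page after another.
    public static string ExtractText(string pdfFile)
    {
        var reader = new PdfReader(pdfFile);
        try
        {
            var text = new StringBuilder();

            // iTextSharp page numbers start at 1, not 0.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
            }

            return text.ToString();
        }
        finally
        {
            // Release the underlying file handle even if extraction fails.
            reader.Close();
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: PdfTextDump <file.pdf>");
            return;
        }

        Console.WriteLine(ExtractText(args[0]));
    }
}
```

Because `ExtractText` takes a file path and returns a plain string, it can be called from an existing application without pulling in the console `Main`. Note that this only returns text stored in the PDF's content streams; scanned pages that are just images yield no text without a separate OCR step.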