Is there a proper library which I can use to convert PDF to HTML or some other format that can be converted to HTML easily?
I searched similar questions, but had no luck.
I want to be able to extract text from PDFs, and possibly images. I'm not looking to embed the PDF inside the HTML.
As I mentioned in the comment above, it is definitely possible to convert PDF to HTML using the tool Able2Extract7, which can be downloaded from here.
I have been using this tool for almost 2 years now and I am pretty happy with it. It lets you convert PDF to Word, Excel, PowerPoint, Publisher, HTML, OpenOffice, etc. See screenshot.
Important note: this tool is not freeware.
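If you end up with plain text from the PDF (from this tool or any other extractor), turning it into HTML is the easy part. Below is a minimal Python sketch using only the standard library. It assumes the text has already been extracted; `text_to_html` is a hypothetical helper name of my own, not part of Able2Extract or any PDF library, and it does not handle images or layout.

```python
import html


def text_to_html(text, title="Converted PDF"):
    """Wrap already-extracted plain text in a minimal HTML page."""
    # Treat blank lines as paragraph breaks and drop empty chunks.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Escape HTML-special characters, keep single line breaks as <br>.
    body = "\n".join(
        "<p>" + html.escape(p).replace("\n", "<br>\n") + "</p>"
        for p in paragraphs
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        '<head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body>\n"
        "</html>\n"
    )
```

You would call it as `text_to_html(extracted_text)` and write the result to an `.html` file.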
HTH