
Text extraction library for different file types (PDF, DOC, DOCX, TXT) in C#

I'm building an information retrieval system that searches text in multiple file formats. I tried the EPocalipse IFilter library, but it throws an exception when reading DOCX files. I tried the Toxy library, but it throws an exception for Arabic DOC files. Finally, I tried the TikaOnDotNet library, but it needs Java to work, and I need to deploy the system online with a hosting provider that doesn't have Java installed on the server.


  • What about using libraries such as the following:

    For DOC/DOCX:

    For PDF:

    For TXT:
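
If pulling in a library is not an option for every format, TXT and DOCX can be read with the .NET base class library alone, with no Java dependency. The following is only a minimal sketch under some assumptions: `TextExtractor`, `Extract` and `ExtractDocx` are hypothetical names made up for this example, text files are assumed to be UTF-8 or to carry a byte order mark, and only the main document body of a DOCX file is read (headers, footers, footnotes, tabs and line breaks are ignored). DOC and PDF are binary formats and still need a third-party library, so the sketch rejects them.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of any library named above.
public static class TextExtractor
{
    // Main WordprocessingML namespace used by word/document.xml in a DOCX package.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Picks an extraction strategy from the file extension.
    public static string Extract(string path)
    {
        switch (Path.GetExtension(path).ToLowerInvariant())
        {
            case ".txt":
                // Assumption: UTF-8 unless a byte order mark says otherwise.
                using (var reader = new StreamReader(path, Encoding.UTF8, true))
                    return reader.ReadToEnd();

            case ".docx":
                using (var stream = File.OpenRead(path))
                    return ExtractDocx(stream);

            default:
                // DOC and PDF need a dedicated parser; not handled in this sketch.
                throw new NotSupportedException("No built-in extractor for: " + path);
        }
    }

    // A DOCX file is a ZIP archive; the body text lives in word/document.xml.
    public static string ExtractDocx(Stream stream)
    {
        using (var zip = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            var entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found.");

            using (var xml = entry.Open())
            {
                var doc = XDocument.Load(xml, LoadOptions.PreserveWhitespace);

                // One output line per paragraph (w:p), joining its text runs (w:t).
                var paragraphs = doc.Descendants(W + "p")
                    .Select(p => string.Concat(
                        p.Descendants(W + "t").Select(t => t.Value)));

                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }
}
```

Calling `TextExtractor.Extract(path)` on each file then gives the indexer plain text. Because the DOCX text is stored as Unicode XML, Arabic content should come through unchanged. Legacy Arabic TXT files saved in a code page such as Windows-1256 would need that encoding passed to `StreamReader` instead of UTF-8.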