java pdf ocr pdf-parsing

Parsing a PDF to text in Java


I have an Arabic PDF that I want to parse into a text document using Java. I have tried many times: the English words parse successfully, but the Arabic words don't.

Can anyone recommend a solution that will convert the Arabic words properly as well?


Solution

  • I think you can use iText for PDF manipulation in Java. It supports Arabic too.
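    Whichever library does the extraction, one possible reason Arabic comes out wrong is that the PDF stores the letters as Unicode presentation forms (the positional glyph variants in U+FB50..U+FDFF and U+FE70..U+FEFF) rather than as the base letters. The sketch below is only a post-processing step under that assumption, using the JDK's java.text.Normalizer. The class name ArabicTextCleanup and the method toBaseLetters are made up for illustration, and the sample string stands in for whatever your extractor returns.

```java
import java.text.Normalizer;

public class ArabicTextCleanup {

    // NFKC normalization replaces compatibility characters with their
    // standard equivalents. For Arabic this folds positional presentation
    // forms (and the lam-alef ligatures) back to the base letters.
    public static String toBaseLetters(String extracted) {
        return Normalizer.normalize(extracted, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Hypothetical extractor output: the word "salam" written with
        // presentation forms (seen initial, lam medial, alef final, meem isolated).
        String raw = "\uFEB3\uFEE0\uFE8E\uFEE1";
        String cleaned = toBaseLetters(raw);

        // Print the code points so the result is visible on any console.
        cleaned.codePoints()
               .forEach(cp -> System.out.println(String.format("U+%04X", cp)));
    }
}
```

    This does not change the order of the characters, so if the extracted text comes out reversed (visual order instead of logical order), that needs separate handling.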