python linux tabula

How to specify which directory to get files from in Tabula Java


I have this Python code, which I run through the subprocess module to extract data from a PDF, but I can't figure out how to process a file that lives in a different directory. I've tried putting the complete path to the directory containing the file in the code, but that doesn't seem to do the trick. How can I specify which directory Tabula should get the files from?

var = ['java', '-jar', 'tabula-0.9.0-jar-with-dependencies.jar','-p', '1', '-a', '35, 0, 800, 800','-c', '25, 55, 85, 115, 145, 185, 339, 363, 530', file]

Solution

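  • Specifying the full path to the PDF document itself (directory plus filename, not just the directory) as the last argument should be enough. Note that a relative path to the jar is resolved against the current working directory, so it may need a full path as well. Additionally, you might consider using tabula-py, a Python wrapper for tabula-java.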
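
As a minimal sketch of the full-path approach, the command list from the question can be built with absolute paths using pathlib. The jar location, the PDF directory, and the helper names `build_tabula_command` and `extract_table` are assumptions made up for illustration; the `-p`, `-a`, and `-c` values are copied from the question.

```python
import subprocess
from pathlib import Path

# Hypothetical locations: replace these with your own paths.
JAR_PATH = Path("/opt/tabula/tabula-0.9.0-jar-with-dependencies.jar")
PDF_DIR = Path("/data/reports")


def build_tabula_command(pdf_dir, pdf_name, jar_path=JAR_PATH):
    """Return the tabula-java argument list with an absolute PDF path."""
    # Join directory and filename, then make the result absolute so it
    # no longer depends on the current working directory.
    pdf_path = (Path(pdf_dir).expanduser() / pdf_name).resolve()
    return [
        "java", "-jar", str(jar_path),
        "-p", "1",
        "-a", "35, 0, 800, 800",
        "-c", "25, 55, 85, 115, 145, 185, 339, 363, 530",
        str(pdf_path),
    ]


def extract_table(pdf_dir, pdf_name):
    """Run tabula-java on the given PDF and return its standard output."""
    cmd = build_tabula_command(pdf_dir, pdf_name)
    # check=True raises CalledProcessError if java exits with an error,
    # which makes a wrong path visible instead of failing silently.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `extract_table(PDF_DIR, "report.pdf")` (a hypothetical filename) would then run Tabula on that file regardless of where the script is started from. If you switch to tabula-py, its `read_pdf` function likewise takes the path to the PDF as its first argument, so the same absolute path can be passed there.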