Tags: eclipse, dll, package, tesseract, tess4j

How to add DLLs while building a JAR - Eclipse


I have made an OCR application that converts image files to DOC files, using Tesseract as its OCR engine through the Tess4J JNA wrappers. While developing the application, I put the DLL files and the language data (tessdata) in the bin folder of the project, and the application worked fine. Now, when I export the project, the DLL files and tessdata are not included in the JAR, so the program doesn't work. I have tried two export options:

**1. Package Required Libraries into Generated JAR**

I added the DLL files and the tessdata folder to the same directory as the JAR file, but it didn't run.

https://i.sstatic.net/eb3BY.png

It gave me the following error:

F:\New folder>java -jar w.jar scan.jpg
Error opening data file bin//tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent d
irectory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoa
der.java:58)
Caused by: java.util.ServiceConfigurationError: javax.imageio.spi.ImageInputStre
amSpi: Provider com.sun.media.imageioimpl.stream.ChannelImageInputStreamSpi coul
d not be instantiated: java.lang.IllegalArgumentException: vendorName == null!
        at java.util.ServiceLoader.fail(ServiceLoader.java:224)
        at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
        at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
        at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
        at javax.imageio.spi.IIORegistry.registerApplicationClasspathSpis(IIOReg
istry.java:210)
        at javax.imageio.spi.IIORegistry.<init>(IIORegistry.java:138)
        at javax.imageio.spi.IIORegistry.getDefaultInstance(IIORegistry.java:159
)
        at javax.imageio.ImageIO.<clinit>(ImageIO.java:65)
        at net.sourceforge.vietocr.ImageIOHelper.getImageByteBuffer(Unknown Sour
ce)
        at net.sourceforge.tess4j.Tesseract.setImage(Unknown Source)
        at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
        at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
        at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
        at com.shaurya.back.OCR.TesseractEngine.getResult(TesseractEngine.java:2
0)
        at com.shaurya.back.ImageToDocument.identify(ImageToDocument.java:117)
        at com.shaurya.back.ImageToDocument.transform(ImageToDocument.java:53)
        at com.shaurya.front.runnow.main(runnow.java:27)
        ... 5 more
Caused by: java.lang.IllegalArgumentException: vendorName == null!
        at javax.imageio.spi.IIOServiceProvider.<init>(IIOServiceProvider.java:7
6)
        at javax.imageio.spi.ImageInputStreamSpi.<init>(ImageInputStreamSpi.java
:90)
        at com.sun.media.imageioimpl.stream.ChannelImageInputStreamSpi.<init>(Ch
annelImageInputStreamSpi.java:63)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstruct
orAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingC
onstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at java.lang.Class.newInstance(Class.java:374)
        at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
        ... 19 more

F:\New folder>

**2. Copy Required Libraries into a Sub-Folder next to the Generated JAR**

Here too, I copied the DLL files and the tessdata folder into the same directory as the JAR file. (If I copy them into the subfolder containing the libraries, it can't even find the DLL files.)

https://i.sstatic.net/DSR1c.png

The error given is:

F:\New folder\kol>java -jar runn.jar scan.jpg
Error opening data file bin//tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent d
irectory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Exception in thread "main" java.lang.Error: Invalid memory access
        at com.sun.jna.Native.invokePointer(Native Method)
        at com.sun.jna.Function.invokePointer(Function.java:470)
        at com.sun.jna.Function.invoke(Function.java:404)
        at com.sun.jna.Function.invoke(Function.java:315)
        at com.sun.jna.Library$Handler.invoke(Library.java:212)
        at com.sun.proxy.$Proxy0.TessBaseAPIGetUTF8Text(Unknown Source)
        at net.sourceforge.tess4j.Tesseract.getOCRText(Unknown Source)
        at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
        at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
        at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
        at com.shaurya.back.OCR.TesseractEngine.getResult(TesseractEngine.java:2
0)
        at com.shaurya.back.ImageToDocument.identify(ImageToDocument.java:117)
        at com.shaurya.back.ImageToDocument.transform(ImageToDocument.java:53)
        at com.shaurya.front.runnow.main(runnow.java:27)

F:\New folder\kol>

So the main problem seems to be that it isn't able to find the tessdata folder, though the DLLs are found. I am also curious why the exception stack traces differ between the two cases. This seems unusual, since both use the same code and face the same problem; only the packaging is slightly different.

EDIT 1:

It doesn't work even if I move the DLLs and tessdata from bin to another folder and add that folder as an external class folder under Java Build Path -> Libraries. If I do that, I get the same error that tessdata isn't found (in the application itself).

EDIT 2:

instance.setDatapath("bin//tessdata");

This is what is set as my datapath. Maybe changing this in some way might fix the error?

And sorry if there are formatting problems in the post. The Stack Overflow "Ask a question" page isn't showing a preview or the formatting buttons right now. I will edit the post later if there are problems once it does :)

-Shaurya


Solution

  • It looks like it could not locate the tessdata folder under bin. Do you have it there? The double forward slashes also look suspect; try changing the path to "bin/".
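
The error output shows Tesseract trying to open `bin//tessdata/eng.traineddata`, a relative path. A relative path is resolved against the directory the program is started from (here `F:\New folder`), and in the question tessdata was placed next to the JAR rather than in a `bin` subfolder, so the file is likely not where the path points. One possible way around this is to build an absolute path from a known base directory before calling `setDatapath`. The following is only a sketch under assumptions: the class name `TessdataLocator` and the `ocr.home` system property are hypothetical, and it passes the tessdata folder itself because that is what the original `setDatapath` call does (some Tess4J versions may expect the parent folder instead).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tess4J.
public class TessdataLocator {

    /** Resolves the tessdata folder against an explicit base directory. */
    static Path resolveTessdata(Path baseDir) {
        return baseDir.resolve("tessdata").toAbsolutePath().normalize();
    }

    /**
     * Base directory: the hypothetical "ocr.home" system property if set,
     * otherwise the directory the JVM was started from.
     */
    static Path baseDirectory() {
        String home = System.getProperty("ocr.home", System.getProperty("user.dir"));
        return Paths.get(home);
    }

    public static void main(String[] args) {
        Path tessdata = resolveTessdata(baseDirectory());
        if (!Files.isDirectory(tessdata)) {
            // Fail early with the full path, so the wrong location is obvious.
            System.err.println("tessdata folder not found at " + tessdata);
            return;
        }
        // With Tess4J on the classpath, this value would be handed over as:
        // instance.setDatapath(tessdata.toString());
        System.out.println("Using tessdata at " + tessdata);
    }
}
```

With tessdata sitting next to the JAR, starting the program from that directory (or passing `-Docr.home=...` with the folder that contains tessdata) should let the check pass. Printing the absolute path also makes it easy to see where the program is actually looking.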