I've been highlighting text in PDFs using PDFClown, and it mostly works fine. In a few cases, however, it throws the exception shown in the stack trace below:
Exception in thread "main" java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.util.TimSort.mergeLo(Unknown Source)
at java.util.TimSort.mergeAt(Unknown Source)
at java.util.TimSort.mergeCollapse(Unknown Source)
at java.util.TimSort.sort(Unknown Source)
at java.util.TimSort.sort(Unknown Source)
at java.util.Arrays.sort(Unknown Source)
at java.util.Collections.sort(Unknown Source)
at org.pdfclown.tools.TextExtractor.sort(TextExtractor.java:633)
at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:284)
at org.pdfclown.samples.cli.TextHighlightSample.run(TextHighlightSample.java:60)
at com.dhawan.poc.Highlight.main(Highlight.java:9)
Any idea how I can resolve this?
Which version of PDFClown are you using? Your stack trace does not match the current code at http://svn.code.sf.net/p/clown/code/trunk/java/pdfclown.lib, which contains the following comparison used for sorting:
public int compare(
    ITextString textString1,
    ITextString textString2
    )
{
    Rectangle2D box1 = textString1.getBox();
    Rectangle2D box2 = textString2.getBox();
    if(isOnTheSameLine(box1,box2))
    {
        /*
          [FIX:55:0.1.3] In order not to violate the transitive condition, equivalence on x-axis
          MUST fall back on y-axis comparison.
        */
        int xCompare = Double.compare(box1.getX(), box2.getX());
        if(xCompare != 0)
            return xCompare;
    }
    return Double.compare(box1.getY(), box2.getY());
}
(http://svn.code.sf.net/p/clown/code/trunk/java/pdfclown.lib/src/org/pdfclown/tools/TextExtractor.java at revision 121)
This fix was introduced on May 5th, 2014. If you are using a PDFClown version older than 0.1.3, or a 0.1.3 build from before that date, you should update PDFClown.
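If you want to check for yourself whether a given comparator can trigger "Comparison method violates its general contract!" on your data, a brute-force test over a small sample may help. The following is only a sketch: `ComparatorContractCheck`, `findViolation`, `TOY_LINE_ORDER`, `SAMPLE` and the 5-unit line tolerance are all names and values made up for illustration. The toy comparator is deliberately simplistic and is not PDFClown's `isOnTheSameLine` logic.

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ComparatorContractCheck {

    /**
     * Brute-force check of the Comparator contract over all pairs and triples.
     * Returns a description of the first violation found, or null if none.
     */
    static <T> String findViolation(Comparator<T> cmp, List<T> items) {
        for (T a : items) {
            for (T b : items) {
                int ab = Integer.signum(cmp.compare(a, b));
                int ba = Integer.signum(cmp.compare(b, a));
                // sgn(compare(a, b)) must equal -sgn(compare(b, a))
                if (ab != -ba) {
                    return "asymmetric: " + a + " vs " + b;
                }
                for (T c : items) {
                    int bc = Integer.signum(cmp.compare(b, c));
                    int ac = Integer.signum(cmp.compare(a, c));
                    // a < b and b < c must imply a < c
                    if (ab < 0 && bc < 0 && ac >= 0) {
                        return "not transitive: " + a + " < " + b + " < " + c;
                    }
                    // compare(a, b) == 0 must imply a and b order the same against c
                    if (ab == 0 && ac != bc) {
                        return "inconsistent equality: " + a + " ~ " + b + " against " + c;
                    }
                }
            }
        }
        return null;
    }

    // Hypothetical toy comparator: boxes whose y values differ by less than
    // 5 units count as "on the same line" and are ordered by x, else by y.
    static final Comparator<Rectangle2D> TOY_LINE_ORDER = (r1, r2) -> {
        if (Math.abs(r1.getY() - r2.getY()) < 5) {
            int xCompare = Double.compare(r1.getX(), r2.getX());
            if (xCompare != 0) {
                return xCompare;
            }
        }
        return Double.compare(r1.getY(), r2.getY());
    };

    // Made-up sample boxes, chosen so that the "same line" relation chains:
    // first~second and second~third are on one line, first and third are not.
    static final List<Rectangle2D> SAMPLE = Arrays.asList(
        new Rectangle2D.Double(10, 0, 1, 1),
        new Rectangle2D.Double(5, 4, 1, 1),
        new Rectangle2D.Double(0, 8, 1, 1));

    public static void main(String[] args) {
        // Natural ordering of integers satisfies the contract, so this prints "null".
        System.out.println(findViolation(Comparator.<Integer>naturalOrder(), Arrays.asList(3, 1, 2)));
        // The toy comparator orders the three sample boxes in a cycle.
        System.out.println(findViolation(TOY_LINE_ORDER, SAMPLE));
    }
}
```

The check is cubic in the number of items, so it is only meant for a handful of text boxes taken from a page that fails, not for a whole document.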