Tags: indexing, solr, lucene, solrj

Reading internals of Solr index file in Java


I am trying to read a Solr index directly from Java. The index was created by one of the examples that ship with the Solr 6.4 download.
I am using this code:

    import java.io.File;
    import java.io.IOException;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class TestIndex {
        public static void main(String[] args) throws IOException {


            Directory dirIndex = FSDirectory.open(new File("D:\\data\\data\\index"));
            IndexReader indexReader = IndexReader.open(dirIndex);
            Document doc = null;   

            for(int i = 0; i < indexReader.numDocs(); i++) {
                doc = indexReader.document(i);
            }

            System.out.println(doc.toString());

            indexReader.close();
            dirIndex.close();
        }
    }  

Solr jar : solr-solrj-6.5.1.jar
Lucene : lucene-core-r1211247.jar

Exception :

Exception in thread "main" 
org.apache.lucene.index.IndexFormatTooOldException: Format version is not 
supported (resource: 
ChecksumIndexInput(MMapIndexInput(path="D:\data\data\index\segments_2"))): 
1071082519 (needs to be between -9 and -12). This version of Lucene only 
supports indexes created with release 3.0 and later.

Updated code, using Lucene 6.5.1:

    Path path = FileSystems.getDefault().getPath("D:\\data\\data\\index");
    Directory dirIndex = FSDirectory.open(path);
    DirectoryReader dr = DirectoryReader.open(dirIndex);
    Document doc = null;

    for (int i = 0; i < dr.numDocs(); i++) {
        doc = dr.document(i);
    }

    System.out.println(doc.toString());

    dr.close();
    dirIndex.close();

Exception :

java.lang.UnsupportedClassVersionError: org/apache/lucene/store/Directory : Unsupported major.minor version 52.0.

Could you please help me to run this code?

Thanks
Virendra Agarwal


Solution

  • I suggest using Luke:

    https://github.com/DmitryKey/luke

    Luke is the GUI tool for introspecting your Lucene / Solr / Elasticsearch index. It allows:

    • Viewing your documents and analyzing their field contents (for stored fields)
    • Searching in the index
    • Performing index maintenance: index health checking, index optimization (take a backup before running this!)
    • Reading an index from HDFS
    • Exporting the index, or a portion of it, to XML
    • Testing your custom Lucene analyzers
    • Creating your own plugins!
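
A note on the two exceptions themselves, since Luke only sidesteps them. The first one comes from an old lucene-core jar reading a newer index: the "version" it reports, 1071082519, is 0x3FD76C17, the magic number that newer Lucene versions write at the start of the segments file (`CodecUtil.CODEC_MAGIC`). The second one, "Unsupported major.minor version 52.0", means the Lucene classes were compiled for Java 8 (class-file major version 52) but are being loaded by an older JVM, so running the updated code on Java 8 or later with lucene-core 6.5.1 should get past it. Below is a minimal sketch that checks both conditions by reading file headers with the plain JDK. `IndexDiagnostics` and its method names are made up for illustration and are not part of Lucene or Solr.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of Lucene or Solr.
class IndexDiagnostics {

    // Value of Lucene's CodecUtil.CODEC_MAGIC; 0x3FD76C17 == 1071082519,
    // the number that appears in the IndexFormatTooOldException message.
    static final int CODEC_MAGIC = 0x3FD76C17;

    // Magic number at the start of every Java class file.
    static final int CLASS_MAGIC = 0xCAFEBABE;

    /** True if the header bytes begin with the codec magic (big-endian int). */
    static boolean startsWithCodecMagic(byte[] header) {
        return ByteBuffer.wrap(header).getInt() == CODEC_MAGIC;
    }

    /** Returns the class-file major version, e.g. 52 for Java 8, 51 for Java 7. */
    static int classMajorVersion(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header);
        if (buf.getInt() != CLASS_MAGIC) {
            throw new IllegalArgumentException("not a Java class file");
        }
        buf.getShort();                 // minor version, not needed here
        return buf.getShort() & 0xFFFF; // major version as an unsigned value
    }

    /** Reads the first 8 bytes of a file. */
    static byte[] readHeader(String path) throws IOException {
        byte[] header = new byte[8];
        try (DataInputStream in =
                 new DataInputStream(Files.newInputStream(Paths.get(path)))) {
            in.readFully(header);
        }
        return header;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: IndexDiagnostics <segments_N file> <.class file>");
            return;
        }
        System.out.println("segments file has codec header: "
                + startsWithCodecMagic(readHeader(args[0])));
        System.out.println("class file major version: "
                + classMajorVersion(readHeader(args[1])));
        System.out.println("running JVM version: "
                + System.getProperty("java.version"));
    }
}
```

To try it, pass the path of the `segments_2` file from the index directory and the path of any `.class` file extracted from the Lucene jar (for example `Directory.class`). If the reported major version is higher than what the running JVM supports, the JVM is too old for that jar.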