I know Solr is meant to be used for searching.
However, I am doing some benchmarking, and I wonder whether there is a way to retrieve the doc id of every indexed document.
The best option would be to retrieve them without searching (if such a way exists).
I guess the alternative is to query for all documents but ask only for the doc id.
I will be using SolrJ, so answers in terms of SolrJ operations would be most useful.
Use the /export endpoint (see "Exporting Result Sets" in the Solr documentation).
It supports the same fl parameter as a regular search (although searching for just *:* will probably behave quite similarly when you're using SolrJ).
In SolrJ you'll have to use the CloudSolrStream class to properly stream the results, as opposed to the regular behavior when searching for *:*.
From Joel Bernstein's example when introducing the feature:
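// In newer Solr releases CloudSolrStream lives in
// org.apache.solr.client.solrj.io.stream; adjust the import to your version.
import org.apache.solr.client.solrj.io.*;
import java.io.IOException;
import java.util.*;

public class StreamingClient {
    public static void main(String[] args) throws IOException {
        String zkHost = args[0];      // ZooKeeper connection string
        String collection = args[1];  // collection to export from

        // /export needs both a sort and an fl, and the fields used in
        // them must have docValues enabled.
        Map<String, String> props = new HashMap<>();
        props.put("q", "*:*");
        props.put("qt", "/export");
        props.put("sort", "fieldA asc");
        props.put("fl", "fieldA,fieldB,fieldC");

        CloudSolrStream cstream = new CloudSolrStream(zkHost, collection, props);
        try {
            cstream.open();
            while (true) {
                Tuple tuple = cstream.read();
                // The stream signals completion with a tuple whose EOF flag is set.
                if (tuple.EOF) {
                    break;
                }
                String fieldA = tuple.getString("fieldA");
                String fieldB = tuple.getString("fieldB");
                String fieldC = tuple.getString("fieldC");
                System.out.println(fieldA + ", " + fieldB + ", " + fieldC);
            }
        } finally {
            // Always release the underlying connections.
            cstream.close();
        }
    }
}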
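For the doc-id-only case in the question, the parameters reduce to a single field used for both sort and fl. The sketch below only builds the corresponding /export request URL with the standard library, which can be handy for checking the handler from a browser or curl before wiring up SolrJ. The base URL, the collection name collection1, and the field name id are assumptions for illustration; substitute your own uniqueKey field, which must have docValues enabled for /export to accept it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class ExportUrlBuilder {

    // Builds a /export request URL that asks only for the given id field.
    // Nothing here contacts Solr; it only assembles and encodes the parameters.
    static String buildExportUrl(String baseUrl, String collection, String idField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");               // match every document
        params.put("sort", idField + " asc"); // /export requires a sort
        params.put("fl", idField);            // return only the id field

        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (query.length() > 0) {
                query.append('&');
            }
            query.append(e.getKey()).append('=').append(encode(e.getValue()));
        }
        return baseUrl + "/" + collection + "/export?" + query;
    }

    // URL-encodes a parameter value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is guaranteed to be available on every JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host and collection, for illustration only.
        System.out.println(buildExportUrl("http://localhost:8983/solr", "collection1", "id"));
    }
}
```

The same three values (q, sort, fl) are what you would put into the props map of the streaming example above, with qt set to /export.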