I am going to change some field types in the schema, so seems it must re-index all the docs in current Solr index data with this kind of change.
The question is about how to "re-index" all the docs? One solution that I can think of is to "query" all docs through the search interface and dump a large file in XML or JSON, then convert it to the input XML format for Solr, and load it back to Solr again to make the schema change happen.
Is there some better way can do this more efficiently? Thanks for your suggestion.
First of all, dumping the results of a query may not give you the original data if you have fields that are indexed and not stored. In general, it is best to keep a copy of the input to SOLR in a form that you can easily use to rebuild indexes from scratch if you need to. In that case, just run a delete query by posting <delete><query>*:*</query></delete>
then <commit/>
and then <optimize/>
. After that your index is empty and you can add new documents that use the new schema.
But you may be able to get away with just running <optimize/>
after you restart SOLR with the new schema file. It would be good to have a backup where you can test that it works for your configuration.
There is a tool called Luke that can be used to browse and export Lucene indexes. I have never tried it myself, but it might be able to help you export your data so that you can reimport it.