I am a beginner in Solr. I have a scenario where I need to index data from my MySQL database and then query it. I have figured out how to provide my database import configuration using DIH (the DataImportHandler). I also know how to query my index via SolrJ.
How can I do the indexing for my database via the SolrJ client as well?
Is there any way I can make use of my configuration files to achieve the same thing? We need to use Java APIs, so all indexing and querying can be done only via SolrJ.
If you just need to be able to open a connection to your Solr server for the indexing (and don't need to have your configuration files actually integrated with the SolrJ project), this is fairly simple to do.
First, you'll need to open a connection with SolrJ, which is done like so:
HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8983/solr");
Another option is to leverage Spring Data Solr's solr schema and make a Solr server bean, something like the following:
<solr:solr-server id="fullSearchIndex" url="${solrServiceBaseURL}/${solr.full.core}" />
And then you can just use the @Autowired annotation to inject the bean wherever you need it. You could also define your own bean without using the solr schema if you want. (All of this assumes, of course, that you're using Spring, which you may not be, but this is an option for those using the framework.)
Next, you need to tell SolrJ which request handler (qt) and command to use, via ModifiableSolrParams or possibly one of the other query classes:
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("qt", "/dataimport");
params.set("command", "full-import");
QueryResponse response = solrServer.query(params);
The above code tells SolrJ to create a query that will perform a data import of the full-import type.
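To make it concrete what that query amounts to, it is essentially an HTTP request to the /dataimport handler with command=full-import as a parameter. The sketch below builds that URL with the plain JDK rather than SolrJ, purely for illustration. The class and method names (DihRequest, importUri) are hypothetical, and it assumes the handler sits directly under the base URL used above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of SolrJ: shows the URL behind the query above.
final class DihRequest {

    // Builds the URI for a DataImportHandler command such as "full-import"
    // or "delta-import". Assumes the handler is mapped to /dataimport.
    static URI importUri(String baseUrl, String command) {
        String encoded = URLEncoder.encode(command, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/dataimport?command=" + encoded);
    }

    public static void main(String[] args) {
        // Same base URL as the HttpSolrServer example.
        System.out.println(importUri("http://localhost:8983/solr", "full-import"));
    }
}
```

Pasting the printed URL into a browser should trigger the same import, which can be handy for checking the handler configuration before wiring it into SolrJ.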
I think it's worth pointing out that if you have many records to import, the SolrJ program may end before the import does. To check the status of your import, hit http://localhost:8983/solr/dataimport. In my experience, it takes the SolrJ program a few seconds to start up, send the import query, and end, but the actual process that it starts takes several minutes.
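If the program should wait for the import instead of exiting early, one option is to poll the handler's status command until it stops reporting busy. The following is a minimal sketch using the JDK's HttpClient (Java 11+) rather than SolrJ. The class name DihStatusPoller, the five-second interval, and the wt=xml parameter (added so the response format is predictable) are my assumptions, not something from the setup above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: polls the DIH status until the import is done.
final class DihStatusPoller {

    // DIH reports a status of "busy" while an import is running and "idle"
    // otherwise. This checks the XML form of that field.
    static boolean isBusy(String statusResponseBody) {
        return statusResponseBody.contains("<str name=\"status\">busy</str>");
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:8983/solr/dataimport?command=status&wt=xml"))
                .GET()
                .build();

        while (true) {
            String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
            if (!isBusy(body)) {
                break;
            }
            Thread.sleep(5_000); // wait before asking again
        }
        System.out.println("Import is no longer running.");
    }
}
```

Note that a status of idle only means no import is running at that moment, so polling should start after the import request has been sent.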
Also, since you need to use SolrJ for all your indexing, you'll need to think about when you're going to run your optimize command after delta-import. Optimizing is a very expensive operation, as it temporarily doubles your index size. You may want to use something like Quartz to schedule the optimize command to run once or twice a day at most. Personally, I use cron jobs for delta-import and optimize.
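If you would rather keep the scheduling inside the Java process without pulling in Quartz, the JDK's ScheduledExecutorService can do a simple fixed-rate schedule. This is only a sketch under assumptions: it uses the JDK scheduler in place of Quartz or cron, it triggers the optimize through the update handler's optimize=true parameter over plain HTTP rather than through SolrJ, and the class name OptimizeScheduler and the 12-hour interval are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs an optimize twice a day from inside the JVM.
final class OptimizeScheduler {

    // URL that asks the update handler to optimize the index.
    static URI optimizeUri(String baseUrl) {
        return URI.create(baseUrl + "/update?optimize=true");
    }

    static void optimize(HttpClient client, String baseUrl) {
        try {
            HttpRequest request = HttpRequest.newBuilder(optimizeUri(baseUrl)).GET().build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("optimize returned HTTP " + response.statusCode());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        } catch (Exception e) {
            // Swallow and log: an uncaught exception would cancel all future runs.
            System.err.println("optimize failed: " + e);
        }
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // First run after 12 hours, then every 12 hours (twice a day).
        scheduler.scheduleAtFixedRate(
                () -> optimize(client, "http://localhost:8983/solr"),
                12, 12, TimeUnit.HOURS);
    }
}
```

The trade-off compared with cron is that the schedule only lives as long as the JVM does, so this fits best when the indexing application is a long-running service anyway.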