Set multivalued fields with ContentStreamUpdateRequest in Solr


I'm using SolrJ+SolrCell to index the contents of various Word/Excel/PDF files, but there are some fields (e.g. id, name) that I want to be able to set myself:

ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
req.addFile(docFile, null);
req.setParam("literal.id", docProperties.getId());
req.setParam("literal.name", docProperties.getName());

This works for single-valued fields, but when I use the same setParam method to set a multivalued field, only the last element of the input array is stored:

if (docProperties.getCategories() != null) {
    for (String category : docProperties.getCategories()) {
        req.setParam("literal.categories", category);
    }
}

For example, if docProperties.getCategories() is ["News", "Computers", "Tech"], the only value stored in the multivalued categories field is "Tech". I'm not too surprised by this, since I don't think setParam is the proper way to append values to a multivalued field.

However, I am at a loss as to how to do this using the available ContentStreamUpdateRequest methods. If I were working with a SolrInputDocument, it would be a simple matter of passing a collection to the addField method:

SolrInputDocument doc = new SolrInputDocument();
List<String> categories = Arrays.asList("News", "Computers", "Tech");
doc.addField("categories", categories);

Is there a way to do this same sort of thing using ContentStreamUpdateRequest?


Solution

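  • According to http://wiki.apache.org/solr/ExtractingRequestHandler#SolrJ, setting the literal parameters through ModifiableSolrParams works for multivalued fields. Each setParam call replaces the previous value stored under that parameter name, which is why only the last category survives. ModifiableSolrParams.add appends a value instead, so calling it once per category and then passing the params object to req.setParams keeps every value.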
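
A minimal sketch of the ModifiableSolrParams approach, assuming SolrJ is on the classpath and reusing the field names from the question. The class name LiteralParamsExample and the two helper methods are hypothetical, introduced only for illustration. The example builds the request but does not send it.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

// Hypothetical helper class; the names are illustrative, not part of SolrJ.
class LiteralParamsExample {

    // Collects the literal.* parameters for one document.
    static ModifiableSolrParams buildLiteralParams(String id, String name, List<String> categories) {
        ModifiableSolrParams params = new ModifiableSolrParams();
        // set() replaces any existing value, which is fine for single-valued fields.
        params.set("literal.id", id);
        params.set("literal.name", name);
        if (categories != null) {
            for (String category : categories) {
                // add() appends, so each category becomes one value of the field.
                params.add("literal.categories", category);
            }
        }
        return params;
    }

    // Attaches the file and the parameters to an extract request.
    static ContentStreamUpdateRequest buildRequest(File docFile, ModifiableSolrParams params) throws IOException {
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        req.addFile(docFile, null);
        req.setParams(params);
        return req;
    }
}
```

Note that setParams swaps in the whole parameter set, so any values set earlier with req.setParam are lost unless they are also added to the params object.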