
Apache Solr file not getting indexed or "uploaded"


I am using Apache Solr with Java to try to index some files, but I have been unsuccessful with SolrJ. I am on version 5.2 and have also tried 5.1, with no success.

I can use curl to send a file for indexing and then I can successfully search for this file with Solr. This is the command I use:

curl "http://solraddress/solr/my_core/update/extract?literal.id=testdoc&commit=true" -F "testfile=@/Users/lesson2.pdf"

As I said, this works: I can then search for the file and find it.

Using SolrJ, I tried this code to send a file for indexing:

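import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.NamedList;

// solr is the SolrJ client instance and myFile the java.io.File to index (both created elsewhere)
ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");

req.addFile(myFile, "application/octet-stream");
req.setParam("literal.id", "testfile1.pdf");
// commit right away, with waitFlush and waitSearcher both true
req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);

NamedList<Object> result = solr.request(req);
System.out.println("Result: " + result);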

This yields this error:

Error adding field 'stream_size'='null' msg=For input string: "null" using ContentStreamUpdateRequest

I could not find a solution for that error, so I decided to write my own wrapper. I took the headers from my curl request, which were:

> POST solr/my_core/update/extract?literal.id=testdoc&commit=true HTTP/1.1
> User-Agent: curl/7.37.1
> Host: MyHost
> Accept: */*
> Content-Length: 220
> Expect: 100-continue
> Content-Type: multipart/form-data; boundary=------------------------aad460cc324256ec

I then built a POST request with these headers and a multipart file in the body. Doing so gives me a 200 response with this body:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">137</int></lst>
</response>

This looks like a positive response, since it matches what my curl request returns, yet the file never appears to have been indexed: I cannot find it in Solr.

Does anyone have any idea?


Solution

  • It's a bug in Solr 5. There is an open ticket in the Solr JIRA for this problem:

    SOLR-7498: Error adding field 'stream_size'='null' msg=For input string: "null" using ContentStreamUpdateRequest
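
Separately from the SolrJ bug, the hand-built POST described in the question can be checked against how a multipart/form-data body is laid out. Below is a minimal sketch using only the JDK. The class name MultipartSketch, its method names, and the sample boundary and file bytes are hypothetical; the field name testfile and the file name lesson2.pdf come from the curl command. It only assembles the body and the Content-Type value, sends nothing, and is not claimed to be the reason the file could not be found. One detail that is easy to miss: each delimiter line in the body is the header's boundary value prefixed with "--", and the final delimiter also ends with "--".

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: assembles a multipart/form-data body with one file part.
class MultipartSketch {

    // Value for the Content-Type request header; the boundary appears here WITHOUT the leading "--".
    static String contentTypeHeader(String boundary) {
        return "multipart/form-data; boundary=" + boundary;
    }

    // Body layout: "--boundary", part headers, blank line, file bytes, then the closing "--boundary--".
    static byte[] buildBody(String boundary, String fieldName, String fileName, byte[] fileBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String partHead = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n"
                + "\r\n";
        byte[] head = partHead.getBytes(StandardCharsets.UTF_8);
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);
        out.write(head, 0, head.length);
        out.write(fileBytes, 0, fileBytes.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Sample values for illustration only; a real request would read the PDF bytes from disk.
        String boundary = "sketchBoundary123";
        byte[] fakeFile = "not a real pdf".getBytes(StandardCharsets.UTF_8);
        byte[] body = buildBody(boundary, "testfile", "lesson2.pdf", fakeFile);
        System.out.println("Content-Type: " + contentTypeHeader(boundary));
        System.out.println("Content-Length: " + body.length);
    }
}
```

The Content-Length sent with the request should equal the length of this byte array, and the boundary in the header has to match the one used in the body exactly.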