curl correctly posts data to Solr:
$ curl -v 'http://solr.example.no:12699/solr/my_coll/update?commit=true' \
--data '<add><doc><field name="key">KEY__9927.1</field><field name="value">\
{"result":0,"jobId":"9459695","jobNumber":"9927.1"}</field></doc></add>'
The Solr query log says:
[20200306T111354,131] [my_coll_shard1_replica_n85] webapp=/solr path=/update params={commit=true} status=0 QTime=96
I'm trying to do the same thing with Python:
>>> import urllib.request
>>> data = '<add><doc><field name="key">KEY__9927.1</field><field name="value">{"result":0,"jobId":"9459695","jobNumber":"9927.1"}</field></doc></add>'
>>> url = 'http://solr.example.no:12699/solr/my_coll/update?commit=true'
>>> req = urllib.request.Request(url=url, data=data.encode('utf-8'), method='POST')
>>> res = urllib.request.urlopen(req)
But now the Solr query log shows that the POST data has been appended to the query parameter string:
[20200306T112358,780] [my_coll_shard1_replica_n87] webapp=/solr path=/update params={commit=true&<add><doc><field+name="key">KEY__9927.1</field><field+name%3D"value">{"result":0,"jobId":"9459695","jobNumber":"9927.1"}</field></doc></add>} status=0 QTime=30
What is happening here?
The issue is that you're not sending the correct Content-Type for your request. When a request carries data but no explicit Content-Type, urllib.request fills in application/x-www-form-urlencoded, so the body is treated as form parameters within Jetty (or the Solr app) before it reaches the log: any POSTed data that's not multipart can be merged into the query string, and Solr parses them all the same. The /update endpoint accepts multiple formats, such as JSON and XML, and the Content-Type should be set to match the body:
req = urllib.request.Request(url=url, data=data.encode('utf-8'), method='POST', headers={'Content-Type': 'text/xml'})
res = urllib.request.urlopen(req)
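If you want to confirm what will go out before touching the server, you can inspect the Request object first. This is only a small sketch: build_update_request is a hypothetical helper name, not part of urllib or Solr, and the document is shortened for brevity. Note that urllib normalizes header names, so the lookup key is 'Content-type'.

```python
import urllib.request


def build_update_request(url, xml_doc):
    """Hypothetical helper: build a POST for /update with an explicit XML Content-Type."""
    return urllib.request.Request(
        url=url,
        data=xml_doc.encode('utf-8'),
        method='POST',
        headers={'Content-Type': 'text/xml'},
    )


url = 'http://solr.example.no:12699/solr/my_coll/update?commit=true'
doc = '<add><doc><field name="key">KEY__9927.1</field></doc></add>'

req = build_update_request(url, doc)

# Nothing has been sent yet; these only inspect the prepared request.
print(req.get_method())
print(req.get_header('Content-type'))  # urllib stores header names capitalized like this
```

Passing req to urllib.request.urlopen() then sends it exactly as in the snippet above.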
In fact, it's the User-Agent string that changes the behaviour. This is by design: curl has been special-cased to override the default Content-Type handler. If you're not using curl, you have to explicitly provide the Content-Type of the data being submitted. This has probably been done to make it easier to make manual requests using curl on the command line. The implementation is available in SolrRequestParsers.java, line 782.
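To see the two headers that matter here from the client side, you can point urllib at a throwaway local server that just records what it receives. This is a rough sketch, not anything Solr-specific: RecordingHandler, the seen list and the /update path on localhost are all made up for the demonstration, and proxies are disabled so the request really goes to 127.0.0.1.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

seen = []  # (User-Agent, Content-Type) pairs recorded by the throwaway server


class RecordingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a server: records headers, returns an empty 200."""

    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        self.rfile.read(length)  # drain the request body
        seen.append((self.headers.get('User-Agent'),
                     self.headers.get('Content-Type')))
        self.send_response(200)
        self.send_header('Content-Length', '0')
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(('127.0.0.1', 0), RecordingHandler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = 'http://127.0.0.1:%d/update' % server.server_port

# Ignore any proxy settings from the environment for this local test.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
body = b'<add/>'

# 1) No Content-Type given: urllib supplies its form-encoded default.
opener.open(urllib.request.Request(url, data=body, method='POST')).close()

# 2) Explicit Content-Type, as in the fix.
opener.open(urllib.request.Request(
    url, data=body, method='POST',
    headers={'Content-Type': 'text/xml'})).close()

server.shutdown()
server.server_close()

for user_agent, content_type in seen:
    print(user_agent, '|', content_type)
```

The first request arrives with a Python-urllib User-Agent and a form-encoded Content-Type. curl --data sends the same form-encoded Content-Type by default, which is why the User-Agent, not the Content-Type, is what separates the two cases in the question.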