authentication solr solrcloud

Solr CDCR doesn't work if the authentication is enabled


I set up CDCR in my test environment, and it worked perfectly until I uploaded security.json files to the ZooKeeper clusters of both the Source and the Target SolrCloud. The security.json files are identical for both clouds, as are the collection names. The Source logs the following errors:

Request to collection col01 failed due to (401) 

org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://targethost:port/solr/col01_shard1_replica1: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 401 Unauthorized request, Response code: 401</title>
</head>
<body><h2>HTTP ERROR 401</h2>
<p>Problem accessing /solr/col01_shard1_replica1/update. Reason:
<pre>    Unauthorized request, Response code: 401</pre></p>
</body>
</html>

    at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:819)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1263)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1134)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1073)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:177)
    at org.apache.solr.handler.CdcrReplicator.sendRequest(CdcrReplicator.java:136)
    at org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:116)
    at org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://targethost:port/solr/col01_shard1_replica1: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 401 Unauthorized request, Response code: 401</title>
</head>
<body><h2>HTTP ERROR 401</h2>
<p>Problem accessing /solr/col01_shard1_replica1/update. Reason:
<pre>    Unauthorized request, Response code: 401</pre></p>
</body>
</html>

    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:578)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:796)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 4 more

Any idea how I should fix it? Thanks!


Solution

  • I had the same problem. It appears that the target SolrCloud tries to authenticate incoming requests from the source using PKIAuthenticationPlugin, but since none of the source nodes are registered on the target by default, the target logs the following error:

    ERROR (qtp1265210847-32) [ ] o.a.s.s.PKIAuthenticationPlugin Decryption failed , key must be wrong

    After some investigation of PKIAuthenticationPlugin.java, I found that the following piece of code rejects the authentication request:

    PublicKey getRemotePublicKey(String nodename) {
      // Nodes that are not listed among the target cluster's live nodes get no key
      if (!cores.getZkController().getZkStateReader().getClusterState().getLiveNodes().contains(nodename))
        return null;
      // ... (rest of the method omitted)
    }
    

    So, to work around this, I simply added the node names of the source (the entries under /live_nodes) to the target ZooKeeper, and replication started working with security.json enforced on both sides.
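
To make the effect of that live-nodes guard concrete, here is a minimal, self-contained sketch of the check in plain Java. It is only a model, not Solr code: the class `LiveNodeCheckSketch`, its methods, and the node name `sourcehost:8983_solr` are hypothetical names chosen for illustration, and the in-memory set merely stands in for the target cluster's /live_nodes entries.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the guard described above; this is NOT Solr's API.
public class LiveNodeCheckSketch {
    // Stands in for the node names listed under /live_nodes on the target cluster.
    private final Set<String> liveNodes = new HashSet<>();

    // Models adding a source node name to the target's live nodes.
    public void registerLiveNode(String nodeName) {
        liveNodes.add(nodeName);
    }

    // Mirrors the guard: for an unknown node no public key is looked up,
    // so the request cannot be verified and ends up rejected.
    public boolean wouldLookUpKey(String nodeName) {
        return liveNodes.contains(nodeName);
    }

    public static void main(String[] args) {
        LiveNodeCheckSketch target = new LiveNodeCheckSketch();
        String sourceNode = "sourcehost:8983_solr"; // hypothetical source node name

        // Before the workaround: the source node is unknown to the target.
        System.out.println(target.wouldLookUpKey(sourceNode));

        // After the workaround: the source node name is present on the target.
        target.registerLiveNode(sourceNode);
        System.out.println(target.wouldLookUpKey(sourceNode));
    }
}
```

In this model the check fails for the source node until its name is registered on the target, which matches the behaviour described above: once the source node names were present in the target ZooKeeper, replication started working.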