Tags: cassandra, cassandra-3.0, cqlsh, nodetool

How to resolve the cassandra node issue


I created a three-node Cassandra cluster. Initially it worked fine, but now two of the nodes have stopped working. I tried

sudo service dse stop

and

sudo service dse start

and got the error below:

Job for dse.service failed because the control process exited with error code.
See "systemctl status dse.service" and "journalctl -xe" for details.

systemctl status dse.service
● dse.service - LSB: DataStax Enterprise
   Loaded: loaded (/etc/init.d/dse; generated)
   Active: failed (Result: exit-code) since Tue 2020-03-17 04:34:24 UTC; 4min 43s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4263 ExecStop=/etc/init.d/dse stop (code=exited, status=0/SUCCESS)
  Process: 11273 ExecStart=/etc/init.d/dse start (code=exited, status=1/FAILURE)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/dse.service

Mar 17 04:34:14 cstar-node1 su[11442]: pam_unix(su:session): session closed for user cassandra
Mar 17 04:34:14 cstar-node1 su[11456]: Successful su for cassandra by root
Mar 17 04:34:14 cstar-node1 su[11456]: + ??? root:cassandra
Mar 17 04:34:14 cstar-node1 su[11456]: pam_unix(su:session): session opened for user cassandra by (uid=0)
Mar 17 04:34:14 cstar-node1 su[11456]: pam_unix(su:session): session closed for user cassandra
Mar 17 04:34:24 cstar-node1 dse[11273]: ERROR: DSE failed to start. Please check your logs.
Mar 17 04:34:24 cstar-node1 dse[11273]:    ...fail!
Mar 17 04:34:24 cstar-node1 systemd[1]: dse.service: Control process exited, code=exited status=1
Mar 17 04:34:24 cstar-node1 systemd[1]: dse.service: Failed with result 'exit-code'.
Mar 17 04:34:24 cstar-node1 systemd[1]: Failed to start LSB: DataStax Enterprise.

Only one node is up:

nodetool status
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns  Host ID                               Rack
DN  X.X.X.X  ?          1       ?     46fdfb5e-238c-476b-a243-184a530fg30e  rack1
UN  X.X.X.Y  207.4 KiB  1       ?     7fasd242-891d-4ecf-ggef-0f8hffarr434  rack1
DN  X.X.X.Z  ?          1       ?     34ffda2f-46d2-443d-4546-33c55cface2c  rack1

How can I resolve this error? Can anyone help me?

Thanks in advance.


Solution

  • This was asked some time ago, so even if it no longer helps you, it might help others. I had the same issue, but there was no entry in either the Cassandra logs or the system logs, and the start process failed with the same non-descriptive message shown above. To resolve the issue, I stopped (as root):

    • the agent: systemctl stop datastax-agent and
    • the dse service: systemctl stop dse

    Then I deleted the directories where the PID files are located:

    • /var/run/datastax-agent
    • /var/run/dse

    And finally I restarted both services. That did the trick for me. I cannot say whether deleting the PID directories or restarting the datastax-agent actually resolved the problem, but my blind guess would be the PIDs.
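For anyone who wants to script this, the steps above could be sketched roughly as follows. This is only a sketch under a few assumptions: the helper names `run` and `reset_dse` and the `DRY_RUN` variable are made up for illustration, the order in which the two services are started again is a guess (the steps above do not specify one), and the paths are the ones from my setup. By default it only prints the commands, so you can review them before running anything as root.

```shell
#!/bin/sh
# Sketch of the recovery steps described above. Run as root.
# DRY_RUN defaults to 1, so the script only prints what it would do;
# set DRY_RUN=0 to actually execute the commands.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints a command and executes it
# only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# reset_dse: hypothetical wrapper around the three steps.
reset_dse() {
    # 1. Stop the agent and the DSE service.
    run systemctl stop datastax-agent
    run systemctl stop dse

    # 2. Remove the directories that hold the PID files.
    run rm -rf /var/run/datastax-agent /var/run/dse

    # 3. Start both services again (start order is an assumption).
    run systemctl start dse
    run systemctl start datastax-agent
}

reset_dse
```

Saved as, say, `reset-dse.sh`, running `sh reset-dse.sh` lists the five commands, and `sudo DRY_RUN=0 sh reset-dse.sh` executes them. Afterwards `nodetool status` should show whether the node came back as UN.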