cassandra, cluster-computing, netflix, priam

No other nodes seen on Priam cluster


I am using Cassandra 1.2.1 and trying to set up a Priam cluster on AWS with two nodes. However, I can't get both nodes up and running because of an error I don't understand.

When I start both nodes, they are able to connect to each other and exchange some messages. After a few seconds, however, they fail with java.lang.RuntimeException: No other nodes seen!, then disconnect and die. I tested all ports (7000, 9160 and 7199) between the two nodes, and there is no firewall in the way. On the second node, I get a broken pipe just before the exception, as shown below.

Any hints?

DEBUG 18:54:31,776 attempting to connect to /10.224.238.170
DEBUG 18:54:32,402 Reseting version for /10.224.238.170
DEBUG 18:54:32,778 Connection version 6 from /10.224.238.170
DEBUG 18:54:32,779 Upgrading incoming connection to be compressed
DEBUG 18:54:32,779 Max version for /10.224.238.170 is 6
DEBUG 18:54:32,779 Setting version 6 for /10.224.238.170
DEBUG 18:54:32,780 set version for /10.224.238.170 to 6
DEBUG 18:54:33,455 Disseminating load info ...
DEBUG 18:54:59,082 Reseting version for /10.224.238.170
DEBUG 18:55:00,405 error writing to /10.224.238.170
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
    at sun.nio.ch.IOUtil.write(IOUtil.java:43)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at java.nio.channels.Channels.writeFullyImpl(Channels.java:59)
    at java.nio.channels.Channels.writeFully(Channels.java:81)
    at java.nio.channels.Channels.access$000(Channels.java:47)
    at java.nio.channels.Channels$1.write(Channels.java:155)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at org.xerial.snappy.SnappyOutputStream.flush(SnappyOutputStream.java:272)
    at java.io.DataOutputStream.flush(DataOutputStream.java:106)
    at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:189)
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:143)
DEBUG 18:55:01,405 attempting to connect to /10.224.238.170
DEBUG 18:55:01,461 Started replayAllFailedBatches
DEBUG 18:55:01,462 forceFlush requested but everything is clean in batchlog
DEBUG 18:55:01,463 Finished replayAllFailedBatches
 INFO 18:55:01,472 JOINING: schema complete, ready to bootstrap
DEBUG 18:55:01,473 ... got ring + schema info
 INFO 18:55:01,473 JOINING: getting bootstrap token
ERROR 18:55:01,475 Exception encountered during startup
java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap.If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed.  Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster.  Usually, this can be solved by giving all nodes the same seed list.

and on the first node:

DEBUG 18:54:30,833 Disseminating load info ...
DEBUG 18:54:31,532 Connection version 6 from /10.242.139.159
DEBUG 18:54:31,533 Upgrading incoming connection to be compressed
DEBUG 18:54:31,534 Max version for /10.242.139.159 is 6
DEBUG 18:54:31,534 Setting version 6 for /10.242.139.159
DEBUG 18:54:31,534 set version for /10.242.139.159 to 6
DEBUG 18:54:31,542 Reseting version for /10.242.139.159
DEBUG 18:54:31,791 Connection version 6 from /10.242.139.159
DEBUG 18:54:31,792 Upgrading incoming connection to be compressed
DEBUG 18:54:31,792 Max version for /10.242.139.159 is 6
DEBUG 18:54:31,792 Setting version 6 for /10.242.139.159
DEBUG 18:54:31,793 set version for /10.242.139.159 to 6
 INFO 18:54:32,414 Node /10.242.139.159 is now part of the cluster
DEBUG 18:54:32,415 Resetting pool for /10.242.139.159
DEBUG 18:54:32,415 removing expire time for endpoint : /10.242.139.159
 INFO 18:54:32,415 InetAddress /10.242.139.159 is now UP
DEBUG 18:54:32,789 attempting to connect to ec2-75-101-233-115.compute-1.amazonaws.com/10.242.139.159
DEBUG 18:54:58,840 Started replayAllFailedBatches
DEBUG 18:54:58,842 forceFlush requested but everything is clean in batchlog
DEBUG 18:54:58,842 Finished replayAllFailedBatches
 INFO 18:54:58,852 JOINING: schema complete, ready to bootstrap
DEBUG 18:54:58,853 ... got ring + schema info
 INFO 18:54:58,853 JOINING: getting bootstrap token
java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap.If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed.  Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster.  Usually, this can be solved by giving all nodes the same seed list.
    at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:154)
    at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:135)
    at org.apache.cassandra.dht.BootStrapper.getBootstrapTokens(BootStrapper.java:115)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:620)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:508)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:406)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:282)
    at org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:315)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:212)
Cannot load daemon
Service exit with a return value of 3

UPDATE: added cassandra config on both nodes

Cassandra config on node 1:

cluster_name: dmp_cluster
initial_token: null
hinted_handoff_enabled: true
max_hint_window_in_ms: 8
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
authorizer: org.apache.cassandra.auth.AllowAllAuthorizer
partitioner: org.apache.cassandra.dht.RandomPartitioner
data_file_directories:
- /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
disk_failure_policy: stop
key_cache_size_in_mb: null
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
row_cache_provider: SerializingCacheProvider
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
- class_name: com.netflix.priam.cassandra.extensions.NFSeedProvider
  parameters:
  - seeds: 127.0.0.1
flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
concurrent_reads: 32
concurrent_writes: 32
memtable_flush_queue_size: 4
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: null
start_native_transport: false
native_transport_port: 9042
start_rpc: true
rpc_address: null
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16
incremental_backups: true
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 128
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 8
compaction_preheat_key_cache: true
read_request_timeout_in_ms: 10000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 10000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
server_encryption_options:
  internode_encryption: none
  keystore: conf/.keystore
  keystore_password: cassandra
  truststore: conf/.truststore
  truststore_password: cassandra
client_encryption_options:
  enabled: false
  keystore: conf/.keystore
  keystore_password: cassandra
internode_compression: all
inter_dc_tcp_nodelay: true
auto_bootstrap: true
memtable_total_space_in_mb: 1024
stream_throughput_outbound_megabits_per_sec: 400
num_tokens: 1

Cassandra config on node 2:

cluster_name: dmp_cluster
initial_token: null
hinted_handoff_enabled: true
max_hint_window_in_ms: 8
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
authorizer: org.apache.cassandra.auth.AllowAllAuthorizer
partitioner: org.apache.cassandra.dht.RandomPartitioner
data_file_directories:
- /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
disk_failure_policy: stop
key_cache_size_in_mb: null
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
row_cache_provider: SerializingCacheProvider
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
- class_name: com.netflix.priam.cassandra.extensions.NFSeedProvider
  parameters:
  - seeds: 127.0.0.1
flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
concurrent_reads: 32
concurrent_writes: 32
memtable_flush_queue_size: 4
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: null
start_native_transport: false
native_transport_port: 9042
start_rpc: true
rpc_address: null
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16
incremental_backups: true
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 128
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 8
compaction_preheat_key_cache: true
read_request_timeout_in_ms: 10000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 10000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
server_encryption_options:
  internode_encryption: none
  keystore: conf/.keystore
  keystore_password: cassandra
  truststore: conf/.truststore
  truststore_password: cassandra
client_encryption_options:
  enabled: false
  keystore: conf/.keystore
  keystore_password: cassandra
internode_compression: all
inter_dc_tcp_nodelay: true
auto_bootstrap: true
memtable_total_space_in_mb: 1024
stream_throughput_outbound_megabits_per_sec: 400
num_tokens: 1

Solution

  • You have the seeds on both machines set to 127.0.0.1, so each node lists only itself as a seed and neither can discover the other. You also have listen_address and rpc_address set to null, which may be problematic if your hosts are not configured properly. My advice is to set the listen and RPC addresses explicitly, then set both nodes' seeds to the address of whichever node you start first.
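
As a rough illustration of that advice, the relevant part of the config for node 1 might look like the fragment below. This is only a sketch: it assumes node 1 is 10.224.238.170 (the address node 2 connects to in the logs above) and that it is the node started first, and it assumes the seed provider in use actually reads the seeds parameter. All other settings stay as posted.

```yaml
# Sketch for node 1 (10.224.238.170, assumed to be the node started first).
# On node 2, only listen_address and rpc_address change, to 10.242.139.159.
listen_address: 10.224.238.170   # explicit address instead of null
rpc_address: 10.224.238.170      # explicit address instead of null
seed_provider:
- class_name: com.netflix.priam.cassandra.extensions.NFSeedProvider
  parameters:
  # Same value on both nodes: the address of the first-started node.
  # Assumption: the seed provider in use honors this parameter.
  - seeds: "10.224.238.170"
```

With this layout both nodes share the same seed list, which is what the exception message itself recommends, and the first node finds its own listen_address among the seeds.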