Search code examples
hivebrisk

Brisk TaskTracker not starting in a multi-node Brisk setup


I have a 3 node Brisk cluster (Briskv1.0_beta2). Cassandra is working fine (all three nodes see each other and data is balanced across the ring). I started the nodes with the brisk cassandra -t command. I cannot, however, run any Hive or Pig jobs. When I do, I get an exception saying that it cannot connect to the task tracker.

During the startup process, I see the following in the log:

TaskTracker.java (line 695) TaskTracker up at: localhost.localdomain/127.0.0.1:34928

A few lines later, however, I see this:

Retrying connect to server: localhost.localdomain/127.0.0.1:8012. Already tried 9 time(s).
INFO [TASK-TRACKER-INIT] RPC.java (line 321) Server at localhost.localdomain/127.0.0.1:8012 not available yet, Zzzzz... 

Those lines are repeated non-stop as long as my cluster is running.

My cassandra.yaml file specifies the box IP (not 0.0.0.0 or localhost) as the listen_address and the rpc_address is set to 0.0.0.0

Why is the client attempting to connect to a different port than the log shows the task tracker as using? Is there anywhere these addresses/ports can be specified?


Solution

  • I figured this out. In case anyone else has the same issues, here's what was going on: Brisk uses the first entry in the Cassandra cluster's seed list to pick the initial jobtracker. One of my nodes had 127.0.0.1 in the seed list. This worked for the Cassandra setup since all the other nodes in the cluster connected to that box to get the cluster topology but this didn't work for the job tracker selection.