Why is my bolt forgetting its name?


I've made a simple logging bolt, but it seems to forget the name provided in its constructor. Also, it seems that the constructor is somehow bypassed, because when I add some logging code to it, it does not log anything.

Is Apache Storm doing something weird with the bolt at some point?

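    // Imports assume Storm 1.x+ package names and Log4j 2
    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Tuple;

    public class SimpleLoggingBolt extends BaseBasicBolt {
        private static final Logger LOG = LogManager.getRootLogger();
        private static String loggingBoltName;

        public SimpleLoggingBolt(String name) {
            super();
            LOG.info("This does not log anything");
            loggingBoltName = name;
        }

        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            LOG.info("Bolt {} received tuple: {}", loggingBoltName, input);
            // Logs "Bolt null received tuple..."
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {}
    }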

Solution

  • The JVM that executes your bolt's constructor is not the same JVM that runs the execute method.

    When you submit your topology with storm jar, the storm command starts a new JVM on the machine you run it from, and that JVM runs your topology wiring code (the part that uses a TopologyBuilder and calls StormSubmitter). This includes running your bolt's constructor. I believe this JVM just logs to the terminal you use to run storm jar, which is why you don't see a log.

    Once your topology is submitted and validated, it gets serialized and sent over the network to the hosts running your supervisors (the machines you run storm supervisor on). Say a supervisor gets assigned to run one of your bolt instances. It will start a separate "worker" JVM, which will run your bolt (and probably also some other components). The worker JVM is where the actual work happens, and is where execute gets run.

    So what's happening for you is that your static loggingBoltName is initialized in the storm jar JVM, the bolt gets serialized, and when it gets deserialized in the worker JVM, the static field is null again because static fields are not serialized.

    If you want to keep the field value, you should not declare the field static. If the field can be serialized, and is not static or transient, it will keep the value you set in the constructor once the bolt is deserialized in the worker.

    Here's some related documentation: https://storm.apache.org/releases/2.0.0-SNAPSHOT/Understanding-the-parallelism-of-a-Storm-topology.html
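
To see the effect without a cluster, here is a minimal sketch using plain Java serialization, which as far as I know is what Storm uses to ship bolt instances. DemoBolt, StaticFieldDemo and roundTripAsIfInNewJvm are hypothetical names made up for this illustration, not Storm classes. Since everything runs in a single JVM, the sketch clears the static field by hand to mimic a freshly started worker JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class StaticFieldDemo {

    /** Hypothetical stand-in for a bolt; this is not a Storm class. */
    static class DemoBolt implements Serializable {
        private static final long serialVersionUID = 1L;

        static String staticName;        // class-level state, never written to the stream
        static int constructorCalls = 0; // counts how often the constructor runs
        final String instanceName;       // instance state, written to the stream

        DemoBolt(String name) {
            constructorCalls++;
            staticName = name;
            instanceName = name;
        }
    }

    /** Serializes the bolt, clears the static field to mimic a fresh JVM, then deserializes. */
    static DemoBolt roundTripAsIfInNewJvm(DemoBolt bolt) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bolt);
            }
            // Assumption of this sketch: a new JVM loads the class afresh,
            // so its static fields start at their default values.
            DemoBolt.staticName = null;
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (DemoBolt) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DemoBolt copy = roundTripAsIfInNewJvm(new DemoBolt("logger-1"));
        System.out.println("instance field: " + copy.instanceName);            // logger-1
        System.out.println("static field: " + DemoBolt.staticName);            // null
        System.out.println("constructor calls: " + DemoBolt.constructorCalls); // 1
    }
}
```

The constructor count stays at 1 because deserializing a Serializable class does not invoke that class's constructor. This matches what you observed: in the worker, the bolt appears without its constructor ever running there, and only the non-static field carries the name across.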