elasticsearch, web-crawler, stormcrawler

No tuples are emitted or transferred by the topology in the Storm UI


I am using StormCrawler 1.16 with Elasticsearch 7.2.0. I built the project with the archetype. This is the command I ran to submit the topology:

 storm jar target/stormcrawler-1.0-SNAPSHOT.jar  org.apache.storm.flux.Flux --remote es-crawler.flux

I get this in the output:

 Parsing file: /home/ubuntu/stormcrawler/es-crawler.flux
 835  [main] INFO  o.a.s.f.p.FluxParser - loading YAML from input stream...
 841  [main] INFO  o.a.s.f.p.FluxParser - Not performing property substitution.
 841  [main] INFO  o.a.s.f.p.FluxParser - Not performing environment variable substitution.
 900  [main] INFO  o.a.s.f.p.FluxParser - Loading includes from resource: /crawler-default.yaml
 901  [main] INFO  o.a.s.f.p.FluxParser - loading YAML from input stream...
 903  [main] INFO  o.a.s.f.p.FluxParser - Not performing property substitution.
 903  [main] INFO  o.a.s.f.p.FluxParser - Not performing environment variable substitution.
 Configuration (interpreted): 

The last lines of the output are:

   2014 [main] WARN  o.a.s.u.Utils - STORM-VERSION new 1.2.3 old 1.2.3
   2376 [main] INFO  o.a.s.StormSubmitter - Finished submitting topology: crawler

But when I check the crawler topology in the Storm UI, the topology stats show that no tuples are emitted or transferred by it.

I have attached a snapshot of the Storm UI (in the topology stats, no tuples are emitted or transferred).

How can I solve this issue?


Solution

  • Your POM file is probably missing the storm-crawler-elasticsearch dependency.

    You could compare your code with what is generated by the storm-crawler-elasticsearch-archetype, which should give you a working configuration.

    Use the archetype for Elasticsearch with:

    mvn archetype:generate -DarchetypeGroupId=com.digitalpebble.stormcrawler -DarchetypeArtifactId=storm-crawler-elasticsearch-archetype -DarchetypeVersion=LATEST

    You'll be asked to enter a groupId (e.g. com.mycompany.crawler), an artifactId (e.g. stormcrawler), a version and a package name.

    This will not only create a fully formed project containing a POM with the dependency above but also a set of resources, configuration files and a topology class. Enter the directory you just created (it should have the same name as the artifactId you specified earlier) and follow the instructions in the README file.
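
Before regenerating the project, it may be worth confirming whether the existing POM really lacks the dependency. The following is only a minimal sketch: `has_dependency` is a hypothetical helper (not part of StormCrawler or Maven), and it assumes the POM writes the `<artifactId>` element on one line without extra whitespace, as generated POMs typically do.

```shell
#!/bin/sh
# Hypothetical helper (not part of StormCrawler): succeeds if the POM given as
# the first argument declares an artifact whose artifactId is the second argument.
has_dependency() {
  grep -q "<artifactId>$2</artifactId>" "$1" 2>/dev/null
}

# Run from the project root, next to pom.xml.
if has_dependency pom.xml storm-crawler-elasticsearch; then
  echo "storm-crawler-elasticsearch is declared in pom.xml"
else
  echo "storm-crawler-elasticsearch is NOT declared in pom.xml"
fi

# To see what Maven actually resolves, the standard dependency plugin can be used:
#   mvn dependency:tree -Dincludes=com.digitalpebble.stormcrawler
```

If the check reports that the dependency is not declared, comparing the POM with the one generated by the archetype, as suggested above, is the most direct way to see what else might differ.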