Stormcrawler is indexing documents into Elasticsearch, but the content field is missing.
Stormcrawler is up-to-date with 'origin/master' https://github.com/DigitalPebble/storm-crawler.git
Using elasticsearch-5.6.4
crawler-conf.yaml has:
indexer.url.fieldname: "url"
indexer.text.fieldname: "content"
indexer.canonical.name: "canonical"
The url and title fields are indexed, but not content.
I have been trying to get this working by following Julien's tutorial at: https://www.youtube.com/watch?v=xMCuWpPh-4A
Everything is working, except that the content is not being indexed into Elasticsearch. I feel like this is some small config error, but I have tried many variations with no luck, so now I am asking for help.
Thanks.
Are you sure that the content is not indexed? The content field is not stored (see ES_IndexInit.sh), but it should be indexed. That means it is searchable even though it is not returned with the hits. To store it, modify the init script and re-run the crawl; you would then get it back the same way as the other fields. To test that it is indexed, try querying on it and see how it affects the results.
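A rough way to run that check from a script is sketched below. It is only an illustration under some assumptions: Elasticsearch is reachable at http://localhost:9200, the index is called "index" (check the index name in your ES config, it may differ), and url and title are stored fields as in the question. The helper names build_content_query and search are made up for this example. If a word you know appears on a crawled page gives hits, the content field is indexed even though it does not come back in the results.

```python
import json
import urllib.request


def build_content_query(term, field="content"):
    """Build a search body that matches `term` against `field`."""
    return {
        "query": {"match": {field: term}},
        # Only request fields assumed to be stored; adjust to your mapping.
        "stored_fields": ["url", "title"],
        "size": 5,
    }


def search(term, base_url="http://localhost:9200", index="index"):
    """POST the query to the _search endpoint and return the parsed response.

    base_url and index are assumptions; change them to match your setup.
    """
    body = json.dumps(build_content_query(term)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs a running Elasticsearch with crawled documents):
# result = search("some word from a crawled page")
# print(result["hits"]["total"])
```

If the hit count changes depending on the word you search for, the text is being indexed and the only thing missing is storage of the field.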