elasticsearch kibana docker-swarm docker-stack elk

Deploying ELK on a single-node docker swarm fails


I am trying to deploy ELK on my small server (2 cores, 2G RAM), but the ELK stack services just keep restarting and never become usable.

The logs printed by those containers show no errors, just a few warnings about deprecated methods.

Logstash log:
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.headius.backport9.modules.Modules (file:/usr/share/logstash/logstash-core/lib/jars/jruby-complete-9.2.7.0.jar) to field java.io.FileDescriptor.fd
WARNING: Please consider reporting this to the maintainers of com.headius.backport9.modules.Modules
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

No errors are printed by the Kibana and Elasticsearch containers.

Here is the docker stack compose file: https://github.com/deviantony/docker-elk/blob/master/docker-stack.yml. I didn't change anything except turning down the heap size.

But if I use docker-compose instead of docker stack deploy in swarm mode, everything goes smoothly.

Also, my CPU jumps to 100% while memory usage is only 60% when I start up the services.

How can I debug this problem? Thanks in advance.


Solution

  • I think your problem is still caused by a lack of memory. I tested the compose stack you linked above and checked docker stats: the memory usage was fluctuating around 1.8G.

    You mentioned that you turned down the heap size in your compose file, from ES_JAVA_OPTS: "-Xmx512m -Xms512m" to something lower. Still, I don't recommend cutting the heap size below 256m. Anything lower than that will cause errors like:

    [circuit_breaking_exception] [parent] Data too large, data for [<http_request>] would be xxx, which is larger than the limit of xxx
    

    More complicated queries or other operations will throw even more errors.

    Besides, note that you have a single host, yet you are still using swarm, so that host acts as both manager and worker node. Any other redundant service or application will push your host to the edge of breakdown. A server with 2G RAM is not enough to host the whole ELK stack for most common usage. If you insist, try adding mem_limit to your compose file (you don't really need to use v3; v2 is enough for a single-node service) to limit your containers' memory usage.
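
    As a rough sketch of the mem_limit suggestion, a v2 compose file could look something like the one below. This is not the file from the docker-elk repository: the image references, the ELK_VERSION variable, and every memory figure are assumptions chosen for illustration, and they have not been tuned or tested on a 2G host.

    ```yaml
    # Sketch only: all sizes below are illustrative assumptions, not tested values.
    # ELK_VERSION is a hypothetical variable you would set yourself (e.g. in a .env file).
    version: "2.4"

    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:${ELK_VERSION}
        environment:
          # Keep the JVM heap at or above 256m, as recommended above.
          ES_JAVA_OPTS: "-Xmx256m -Xms256m"
          discovery.type: single-node
        # Hard cap on the container's memory (heap plus off-heap overhead).
        mem_limit: 768m

      logstash:
        image: docker.elastic.co/logstash/logstash:${ELK_VERSION}
        environment:
          LS_JAVA_OPTS: "-Xmx256m -Xms256m"
        mem_limit: 512m

      kibana:
        image: docker.elastic.co/kibana/kibana:${ELK_VERSION}
        mem_limit: 512m
    ```

    Note that mem_limit is a v2 option for docker-compose and is not applied by docker stack deploy; a v3 stack file would express the same cap under deploy.resources.limits.memory instead. Either way, keep each limit comfortably above the configured heap, since the JVM also needs memory outside the heap.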