I am collecting log data using Filebeat 7.x, but I am facing a problem: the log volume is very large (about 100 GB per day).
Now I am wondering how to collect only the error-level logs from the source files. What is the best way to do this?
I am using Filebeat to send the logs to Elasticsearch, which runs in a Kubernetes cluster. My concern here is: must I use Kafka and Logstash to define such a filtering rule?
Please find below the filebeat config file being used:
filebeat.yml: |
  filebeat.inputs:
  - type: container
    paths:
      - /var/log/containers/*.log
    processors:
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
          - logs_path:
              logs_path: "/var/log/containers/"

  output.elasticsearch:
    host: '${NODE_NAME}'
    hosts: '${ELASTICSEARCH_HOSTS:elasticsearch-master:9200}'
I would recommend configuring the flow as:
Filebeat → Kafka → Logstash → Elasticsearch → Kibana
Filebeat reads & push logs from your server/s to Kafka topic/s as configured.
Logstash then subscribes to those Kafka topics, performs parsing, filtering, and formatting, includes or excludes fields as required (for example, dropping everything that is not error level), and sends the processed log data to an Elasticsearch index.
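Here is a rough sketch of such a Logstash pipeline; the broker addresses, topic, consumer group, the assumption that the level can be matched in the message field, and the index name are all placeholders to adapt to your setup:

  input {
    kafka {
      bootstrap_servers => "kafka-0.kafka:9092,kafka-1.kafka:9092"
      topics => ["filebeat-logs"]
      group_id => "logstash"
      codec => "json"   # Filebeat publishes events to Kafka as JSON
    }
  }

  filter {
    # Keep only error-level events; adjust this condition to however
    # your applications mark the level (e.g. a parsed "level" field).
    if [message] !~ /(?i)error/ {
      drop { }
    }
  }

  output {
    elasticsearch {
      hosts => ["http://elasticsearch-master:9200"]
      index => "filebeat-errors-%{+YYYY.MM.dd}"
    }
  }

Dropping the non-error events at this stage means only a small fraction of your 100 GB per day ever reaches Elasticsearch.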
Finally, visualize your data via Kibana dashboards.