The problem is to push JSON logs collected by Filebeat to Elasticsearch with a defined _type and _id. The default Elasticsearch _type is "log" and the _id is an auto-generated value such as "AVryuUKMKNQ7xhVUFxN2".
My log row:
{"unit_id":10001,"node_id":1,"message":"Msg ..."}
Desired record in Elasticsearch:
"hits" : [ {
"_index" : "filebeat",
"_type" : "unit_id",
"_id" : "10001",
...
"_source" : {
"message" : "Msg ...",
"node_id" : 1,
...
}
} ]
I know how to do it with Logstash: just use document_id => "%{unit_id}" and document_type => "unit_id" in the output section. The goal is to use only Filebeat, because it is a very lightweight solution and no intermediate aggregation is needed here.
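For reference, the Logstash approach mentioned above would look roughly like this (a sketch; the hosts and index values are assumptions, not from the original setup):

```
output {
  elasticsearch {
    hosts         => ["localhost:9200"]
    index         => "filebeat"
    # Copy the unit_id field from the event into the document _id
    document_id   => "%{unit_id}"
    document_type => "unit_id"
  }
}
```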
You can set a custom _type by using the document_type option in Filebeat. There is no way to set the _id directly in Filebeat as of version 5.x.
filebeat.prospectors:
- paths: ['/var/log/messages']
  document_type: syslog
You could use the Elasticsearch Ingest Node feature to set the _id field. You would need to use a script processor to copy a value from the event into the _id field. Once you have defined your pipeline, you would tell Filebeat to send its data to that pipeline using the output.elasticsearch.pipeline config option.
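A minimal sketch of what that could look like, assuming Elasticsearch 5.x with ingest node enabled (the pipeline name "unit-id-pipeline" and the host are made up for illustration). First, define a pipeline whose script processor copies unit_id into _id:

```
PUT _ingest/pipeline/unit-id-pipeline
{
  "description": "Copy unit_id from the event into the document _id",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "inline": "ctx._id = ctx.unit_id.toString()"
      }
    }
  ]
}
```

Then point Filebeat's Elasticsearch output at that pipeline:

```
output.elasticsearch:
  hosts: ["localhost:9200"]
  # Route every event through the ingest pipeline defined above
  pipeline: unit-id-pipeline
```

Note that this still does not change the _type; for that you would keep using document_type in the prospector config as shown earlier.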