I have written a Logstash config file to upload a CSV. Each row contains information for multiple applicants, and I need the applicants to be indexed as an array of dictionaries in the Kibana index, instead of a dictionary of dictionaries keyed by index.
filter {
  csv {
    separator => ","
    skip_header => true
    columns => [LoanID,Applicant_Income1,Occupation1,Time_At_Work1,Date_Of_Join1,Gender,LoanAmount,Marital_Status,Dependents,Education,Self_Employed,Applicant_Income2,Occupation2,Time_At_Work2,Date_Of_Join2,Applicant_Income3,Occupation3,Time_At_Work3,Date_Of_Join3]
  }
  mutate {
    convert => {
      "Applicant_Income1" => "float"
      "Time_At_Work1" => "float"
      "LoanAmount" => "float"
      "Applicant_Income2" => "float"
      "Time_At_Work2" => "float"
      "Applicant_Income3" => "float"
      "Time_At_Work3" => "float"
    }
  }
  mutate {
    rename => {
      "Applicant_Income1" => "[Applicant][0][Applicant_Income]"
      "Occupation1" => "[Applicant][0][Occupation]"
      "Time_At_Work1" => "[Applicant][0][Time_At_Work]"
      "Date_Of_Join1" => "[Applicant][0][Date_Of_Join]"
      "Applicant_Income2" => "[Applicant][1][Applicant_Income]"
      "Occupation2" => "[Applicant][1][Occupation]"
      "Time_At_Work2" => "[Applicant][1][Time_At_Work]"
      "Date_Of_Join2" => "[Applicant][1][Date_Of_Join]"
      "Applicant_Income3" => "[Applicant][2][Applicant_Income]"
      "Occupation3" => "[Applicant][2][Occupation]"
      "Time_At_Work3" => "[Applicant][2][Time_At_Work]"
      "Date_Of_Join3" => "[Applicant][2][Date_Of_Join]"
    }
  }
  date {
    match => [ "Date_Of_Join1", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  date {
    match => [ "Date_Of_Join2", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  date {
    match => [ "Date_Of_Join3", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
}
I got the Applicant field as
But I need the Applicant field to be an array of dictionaries, like
I tried add_field, but it did not work:
mutate {
  add_field => {
    "[Applicant][Applicant_Income1]" => "Applicant_Income1",
    "[Applicant][Occupation1]" => "Occupation1",
    "[Applicant][Time_At_Work1]" => "Time_At_Work1",
    "[Applicant][Date_Of_Join1]" => "Date_Of_Join1"
  }
}
Square brackets in Logstash filters do not index into arrays the way they do in other programming languages such as Java. [Applicant][0][Applicant_Income] is not the right syntax to set the value of the field Applicant_Income on the first element (zero-based index) of the Applicant array. Instead, it creates sub-elements 0, 1, 2 underneath the Applicant element, as shown in Figure 1.
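To make the difference concrete, here is a small plain-Ruby sketch (runnable outside Logstash) that contrasts the two shapes. The sample values are made up for illustration, and the variable names nested and as_array are hypothetical, not part of any Logstash API.

```ruby
require "json"

# Roughly the shape produced by renaming to "[Applicant][0][...]":
# a hash whose keys are the strings "0", "1", ... (sample values are invented).
nested = {
  "0" => { "Applicant_Income" => 5000.0, "Occupation" => "Engineer" },
  "1" => { "Applicant_Income" => 3200.0, "Occupation" => "Teacher" }
}

# The shape asked for in the question: an array of hashes.
# Sort by the numeric value of the key so the order stays stable.
as_array = nested.sort_by { |index, _| index.to_i }.map { |_, applicant| applicant }

puts JSON.generate("Applicant" => nested)
puts JSON.generate("Applicant" => as_array)
```

The first line printed is a JSON object with the keys "0" and "1", while the second is a JSON array, which is what Elasticsearch needs to receive in order to store a list of objects.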
To create an array of objects, use the ruby filter plugin (https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html). Because that filter executes arbitrary Ruby code, it gives you more control:
filter {
  csv {
    separator => ","
    skip_header => true
    columns => [LoanID,Applicant_Income1,Occupation1,Time_At_Work1,Date_Of_Join1,Gender,LoanAmount,Marital_Status,Dependents,Education,Self_Employed,Applicant_Income2,Occupation2,Time_At_Work2,Date_Of_Join2,Applicant_Income3,Occupation3,Time_At_Work3,Date_Of_Join3]
  }
  mutate {
    convert => {
      "Applicant_Income1" => "float"
      "Time_At_Work1" => "float"
      "LoanAmount" => "float"
      "Applicant_Income2" => "float"
      "Time_At_Work2" => "float"
      "Applicant_Income3" => "float"
      "Time_At_Work3" => "float"
    }
  }
  ruby {
    code => '
      event.set("Applicant",
        [
          {
            "Applicant_Income" => event.get("Applicant_Income1"),
            "Occupation" => event.get("Occupation1"),
            "Time_At_Work" => event.get("Time_At_Work1"),
            "Date_Of_Join" => event.get("Date_Of_Join1")
          },
          {
            # next object...
          }
        ]
      )
    '
  }
  date {
    match => [ "Date_Of_Join1", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  date {
    match => [ "Date_Of_Join2", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  date {
    match => [ "Date_Of_Join3", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  mutate {
    remove_field => [
      "Applicant_Income1",
      "Occupation1",
      "Time_At_Work1",
      "Date_Of_Join1",
      "Applicant_Income2",
      "Occupation2",
      "Time_At_Work2",
      "Date_Of_Join2",
      "Applicant_Income3",
      "Occupation3",
      "Time_At_Work3",
      "Date_Of_Join3"
    ]
  }
}
With event.set you add a field to the document. The first argument is the field name, the second one is its value. In this case, you add the field "Applicant" to the document with an array of objects as its value.
event.get returns the value of a field in the document. You retrieve the value by passing the field name to the method.
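Since the three applicants share the same four field names, the array could also be built in a loop rather than written out by hand. The following is a minimal sketch under stated assumptions: FakeEvent is a hypothetical stand-in that mimics only event.get and event.set so the code runs outside Logstash, build_applicants and FIELDS are illustrative names, and dropping nil values and empty applicants is a design choice of this sketch, not something the original config does.

```ruby
# Hypothetical stand-in for the Logstash event; it mimics only get and set
# so that this sketch can be run with plain Ruby.
class FakeEvent
  def initialize(fields)
    @fields = fields
  end

  def get(name)
    @fields[name]
  end

  def set(name, value)
    @fields[name] = value
  end
end

# Base names shared by every applicant; the CSV columns append 1, 2, 3.
FIELDS = %w[Applicant_Income Occupation Time_At_Work Date_Of_Join].freeze

# Collects the numbered columns into an array of hashes under "Applicant".
# Assumption of this sketch: nil values and empty applicants are skipped.
def build_applicants(event, count = 3)
  applicants = (1..count).map do |n|
    FIELDS.each_with_object({}) do |field, applicant|
      value = event.get("#{field}#{n}")
      applicant[field] = value unless value.nil?
    end
  end
  event.set("Applicant", applicants.reject(&:empty?))
end

# Usage with invented sample values.
sample = FakeEvent.new({
  "LoanID" => "LP001",
  "Applicant_Income1" => 5000.0,
  "Occupation1" => "Engineer",
  "Applicant_Income2" => 3200.0,
  "Occupation2" => "Teacher"
})
build_applicants(sample)
p sample.get("Applicant")
```

Inside a real ruby filter, only the body of build_applicants would be needed, operating on the event object that Logstash provides. The numbered source fields would still have to be removed afterwards, as the mutate filter with remove_field above does.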
Please refer to the Event API guide at https://www.elastic.co/guide/en/logstash/current/event-api.html for more details.
I hope I could help you.