I have a text file that I need to import into Elasticsearch. The file format is:
1 ARsv200711042 Allen Alane
2 ARsv200711042 Allen Arthur
3 ARsv200711042 Allen Bernice
4 ARsv200711042 Allen Betty
5 ARsv200711042 Allen Brittany
6 ARsv200711042 Allen Bruce
7 ARsv200711042 Allen Carolyn
8 ARsv200711042 Allen Carolyn
9 ARsv200711042 Allen Chadderick
10 ARsv200711042 Allen Darlene
I need to capture the data by position. For example, the first column is eMID, which spans positions 1 to 13; StateSource is at positions 14-15, CodeProducts is at positions 16-17, and so on.
So I wrote a Logstash configuration like this:
input {
  file {
    path => "D:/sample/sample 500.txt"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => {
      "message" => [
        "(?<eMID>.{0,13})(?<StateSource>.{0,2})(?<CodeProducts>.{0,2})(?<AcquiredDate>.{0,8})(?<Uses>.{0,2})(?<Prefix>.{0,10})(?<LName>.{0,30})(?<FName>.{0,30})"
      ]
    }
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "sample-data"
    #user => "elastic"
    #password => "changeme"
  }
}
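For reference, the same fixed-width pattern can be tried outside Logstash. This is a minimal Ruby sketch, not part of the pipeline: the helper name `parse_fixed_width` and the padded sample line are assumptions made for illustration (the real file's padding may differ), and the leading `\A` anchor is added so matching starts at position 1.

```ruby
# Same named groups as the grok pattern, anchored at the start of the line.
PATTERN = /\A(?<eMID>.{0,13})(?<StateSource>.{0,2})(?<CodeProducts>.{0,2})(?<AcquiredDate>.{0,8})(?<Uses>.{0,2})(?<Prefix>.{0,10})(?<LName>.{0,30})(?<FName>.{0,30})/

# Hypothetical helper: returns a hash of field name => value with padding trimmed.
def parse_fixed_width(line)
  match = PATTERN.match(line)
  match.named_captures.transform_values(&:strip)
end

# Assumed sample line, padded with spaces to the widths used in the pattern.
line = "1".ljust(13) + "AR" + "sv" + "20071104" + "2".ljust(2) +
       "".ljust(10) + "Allen".ljust(30) + "Alane".ljust(30)

record = parse_fixed_width(line)
puts record.inspect
```

Because every group is a bounded `.{0,N}`, each field simply takes the next N characters, so the pattern only lines up when the input really is space-padded to those widths.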
I was able to import the data successfully. I have the following questions:
1. AcquiredDate holds values such as 20071104, which need to be transformed into a date format that Elasticsearch can analyze.
2. Firstname/FName or Lastname/LName may contain special characters such as + - && || ! ( ) { } [ ] ^ " ~ * ? : \ etc. How can I also match those with regex and insert them into Elasticsearch?

OK, so one way is to split 20071104 into three parts: assign the first four digits (\d{4}) to y, the next two digits (\d{2}) to m, and the remaining two digits (\d{2}) to d, and then build a date object from them.
The second way is to create a date from the string and use that object to reformat it, as in this example I wrote, assuming AcquiredDate is 20071104:
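filter {
  ruby {
    # Load Ruby's Date class once at startup
    init => 'require "date"'
    # Parse the compact yyyymmdd string and store it as yyyy-mm-dd
    code => '
      date = Date.strptime(event.get("AcquiredDate"), "%Y%m%d")
      event.set("new_time", date.strftime("%Y-%m-%d"))
    '
  }
  mutate {
    # Drop fields that are not needed in the indexed document
    remove_field => ["host", "@timestamp", "sequence", "message", "@version"]
  }
}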
This gives you:
{
  "AcquiredDate" => "20071104",
  "new_time" => "2007-11-04"
}
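The first approach mentioned above (splitting the string with \d{4}, \d{2}, \d{2}) could look roughly like the following standalone Ruby sketch. The helper name `parse_yyyymmdd` is hypothetical, and returning nil for malformed or impossible dates is an assumption about how you would want bad input handled.

```ruby
require "date"

# Hypothetical helper: splits a yyyymmdd string into y, m and d and builds a Date.
# Returns nil when the string is not 8 digits or is not a real calendar date.
def parse_yyyymmdd(raw)
  match = /\A(?<y>\d{4})(?<m>\d{2})(?<d>\d{2})\z/.match(raw)
  return nil unless match

  y = match[:y].to_i
  m = match[:m].to_i
  d = match[:d].to_i
  return nil unless Date.valid_date?(y, m, d)

  Date.new(y, m, d)
end

date = parse_yyyymmdd("20071104")
puts date.strftime("%Y-%m-%d")
```

Compared with `Date.strptime`, this variant makes the validation explicit, which may be useful if some rows carry blank or partial dates.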
To answer your second part, use something like this:
mutate {
  strip => ["field1withwhitespace", "field2withwhitespace"]
}
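On the special characters themselves: the `.` in a group like `.{0,30}` matches any character except a newline, so characters such as + ( ) * ? are captured as they are, and only the padding needs trimming. A small Ruby sketch of that behaviour, where the helper `extract_name` and the sample value are made up for illustration:

```ruby
# One 30-character name column, as in the grok pattern.
NAME_FIELD = /\A(?<LName>.{0,30})/

# Hypothetical helper: captures the fixed-width column and trims the padding,
# which is roughly what mutate's strip does to the field afterwards.
def extract_name(column)
  NAME_FIELD.match(column)[:LName].strip
end

# Assumed sample value containing several of the special characters.
padded = "Allen-(Smith)+*?".ljust(30)
name = extract_name(padded)
puts name
```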