I have written a small Java program which generates some dummy logs (it basically writes stuff to a txt file). Now I want to feed this data to the ELK stack: Logstash should read the data from the txt file, and I want to visualize the changes in Kibana, just to get a feel for it.
What I then want to do is change the speed at which my program writes the dummy logs to the txt file, so that I can see the changes in Kibana.
I have just started exploring the ELK stack, and this might be a completely wrong way to do this kind of analysis. Please do suggest if there are better ways to do it (considering I don't have actual logs to work with right now).
Edit: @Val
input {
  generator {
    message => '83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"'
    count => 10
  }
}
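(For what it's worth: with count => 10 the generator input emits that same message ten times and then has nothing more to produce, so it behaves like a fixed replay rather than a continuously growing log file.)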
So here is my logstash.conf:
input {
  stdin { }
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}'
    }
  }
  date {
    match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => "en"
  }
  geoip {
    source => "clientip"
  }
  useragent {
    source => "agent"
    target => "useragent"
  }
}

output {
  stdout {
    codec => plain {
      charset => "ISO-8859-1"
    }
  }
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "apache_elk_example"
    template => "./apache_template.json"
    template_name => "apache_elk_example"
    template_overwrite => true
  }
}
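(As an aside, the grok pattern above is essentially the Apache combined log format; Logstash ships a stock %{COMBINEDAPACHELOG} pattern that extracts the same fields, which would keep this config shorter.)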
Now, after starting Elasticsearch and Kibana, I do:
cat apache_logs | /usr/local/opt/logstash/bin/logstash -f logstash.conf
where apache_logs is being written by my Java program:
public static void main(String[] args) {
    // TODO Auto-generated method stub
    try {
        PrintStream out = new PrintStream(new FileOutputStream("/Users/username/Desktop/user/apache_logs"));
        System.setOut(out);
    } catch (FileNotFoundException ex) {
        System.out.print("Exception");
    }
    while (true)
    // for (int i = 0; i < 5; ++i)
    {
        System.out.println(generateRandomIPs() /* + other log stuff */);
        try {
            Thread.sleep(1000); // 1000 milliseconds is one second.
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}
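(generateRandomIPs() isn't shown in the question; purely as an illustration, a minimal sketch of such a helper might look like the following — the body is an assumption, not code from the original program:)

import java.util.Random;

// Hypothetical helper (not from the original program): builds a random
// dotted-quad IPv4 address such as "83.149.9.216".
private static final Random RANDOM = new Random();

private static String generateRandomIPs() {
    return (RANDOM.nextInt(254) + 1) + "."   // first octet 1-254
            + RANDOM.nextInt(256) + "."      // 0-255
            + RANDOM.nextInt(256) + "."      // 0-255
            + (RANDOM.nextInt(254) + 1);     // last octet 1-254
}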
So here is the problem:
Kibana doesn't give me a real-time visualization, i.e. it does not show new events as and when my Java program appends data to the apache_logs file. It only shows whatever data had already been written to apache_logs at the time I executed:
cat apache_logs | /usr/local/opt/logstash/bin/logstash -f logstash.conf
Might be a bit late, but I wrote up a small sample of what I meant.
I modified your Java program to add a timestamp, like this:
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

import com.google.gson.Gson;

public class LogWriter {

    public static Gson gson = new Gson();

    public static void main(String[] args) {
        try {
            PrintStream out = new PrintStream(new FileOutputStream("/var/logstash/input/test2.log"));
            System.setOut(out);
        } catch (FileNotFoundException ex) {
            System.out.print("Exception");
        }

        Map<String, String> timestamper = new HashMap<>();

        while (true) {
            String format = LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME);
            timestamper.put("myTimestamp", format);
            System.out.println(gson.toJson(timestamper));
            try {
                Thread.sleep(1000); // 1000 milliseconds is one second.
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
This now writes JSON lines like:
{"myTimestamp":"2016-06-10T10:42:16.299"}
{"myTimestamp":"2016-06-10T10:42:17.3"}
{"myTimestamp":"2016-06-10T10:42:18.301"}
I then set up Logstash to read that file, parse it, and output to stdout:
input {
  file {
    path => "/var/logstash/input/*.log"
    start_position => "beginning"
    ignore_older => 0
    sincedb_path => "/dev/null"
  }
}

filter {
  json {
    source => "message"
  }
}

output {
  file {
    path => "/var/logstash/out.log"
  }
  stdout { codec => rubydebug }
}
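(A quick note on those file-input settings: sincedb_path => "/dev/null" stops Logstash from remembering how far it has read, so the file is re-read from the beginning on every restart — handy for testing, but not something you'd want in production.)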
So Logstash picks up my log line (which carries its own creation time in myTimestamp), parses it, and adds a new @timestamp representing when it saw the event:
{
        "message" => "{\"myTimestamp\":\"2016-06-10T10:42:17.3\"}",
       "@version" => "1",
     "@timestamp" => "2016-06-10T09:42:17.687Z",
           "path" => "/var/logstash/input/test2.log",
           "host" => "pandaadb",
    "myTimestamp" => "2016-06-10T10:42:17.3"
}
{
        "message" => "{\"myTimestamp\":\"2016-06-10T10:42:18.301\"}",
       "@version" => "1",
     "@timestamp" => "2016-06-10T09:42:18.691Z",
           "path" => "/var/logstash/input/test2.log",
           "host" => "pandaadb",
    "myTimestamp" => "2016-06-10T10:42:18.301"
}
Here you can now see how long it takes for a log line to be seen and processed — around 300 milliseconds, which I would attribute to the fact that your Java writer is buffered and does not flush right away.
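If you want to shrink that delay, PrintStream has an autoflush constructor — a small tweak to the writer (a sketch, reusing the same path as above):

import java.io.FileOutputStream;
import java.io.PrintStream;

// Passing true as the second argument enables autoflush: the stream is
// flushed on every println(), so each line reaches the file (and Logstash)
// immediately instead of sitting in the output buffer.
PrintStream out = new PrintStream(
        new FileOutputStream("/var/logstash/input/test2.log"), true);
System.setOut(out);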
You can even make this a bit "cooler" by using the elapsed plugin, which will calculate the difference between those timestamps for you.
I hope that helps with your testing :) It might not be the most advanced way of doing it, but it's easy to understand, pretty straightforward, and fast.
Artur