You write a program in R or Python that needs to run on Linux or Windows, and you want to log std-out (JSON structured and unstructured) and std-error (mostly unstructured) from this program to a Fluentd instance. Adding a new program or starting another instance should not require updating the Fluentd configuration, and the applications will not (yet) be running in a Docker environment.
How do you send "logs" from a bunch of programs to a fluentd instance without performing a curl call for every log entry that your application originally wrote to std-out?
When a UDP or TCP 'connection' is necessary for the application to run, the application becomes harder to debug, and the std-out of any dependency of your program has to be parsed just to get its logging passed through.
Alternatively, the question could be: how to accept a 'connection' object that can point either to a file or to a TCP connection, so that switching between std-out and a TCP destination is a matter of changing a single value?
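To make the 'connection object' idea concrete, here is a minimal sketch in Python. The function names `open_log_sink` and `log_line`, and the convention that "-" means std-out and "tcp://host:port" means a TCP destination, are assumptions made up for this illustration; they are not part of fluentd or any library.

```python
import socket
import sys
from urllib.parse import urlparse


def open_log_sink(target):
    """Return a writable text stream for the given target (hypothetical helper).

    "-"                -> standard output
    "tcp://host:port"  -> a TCP connection wrapped as a text stream
    anything else      -> treated as a file path, opened in append mode
    """
    if target == "-":
        return sys.stdout
    parsed = urlparse(target)
    if parsed.scheme == "tcp":
        sock = socket.create_connection((parsed.hostname, parsed.port))
        # makefile() gives the socket the same write()/flush() interface as a file.
        return sock.makefile("w", encoding="utf-8")
    return open(target, "a", encoding="utf-8")


def log_line(sink, message):
    """Write one line and flush so that a reader sees it promptly."""
    sink.write(message + "\n")
    sink.flush()
```

With something like this, the rest of the program only ever calls `log_line(sink, ...)`, and the destination is decided by one string, for example taken from an environment variable or a command-line argument.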
I like the 'tail' input plugin, which could be what I am looking for, but then:
I built an EFK stack with the Docker log driver set to fluentd, which does not seem to be an optimal, solid solution either, but without Docker I already get stuck setting up a basic configuration (not referring to fluent.conf here).
It's recommended to always write application output to a file. If std-out must be written to a file, pipe its output at program startup. For more flexibility in the fluentd configuration, pipe std-out and std-error to separate files (just like Apache does):
My_program.exe Do some crazy stuff > my_out_file.txt 2> my_error_file.txt
This opens the option for fluentd to read from this/these file(s).
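If the program is started from a Python launcher rather than from a shell, the same redirection can be sketched with the standard `subprocess` module. The helper name `run_with_log_files` is hypothetical, and this sketch appends to the files instead of truncating them as `>` does, on the assumption that a tailing collector prefers files that only grow.

```python
import subprocess


def run_with_log_files(command, out_path, err_path):
    """Run `command`, sending its std-out and std-error to two separate files.

    Similar in spirit to `command > out_path 2> err_path`, except that the
    files are opened in append mode so existing content is kept.
    """
    with open(out_path, "ab") as out_file, open(err_path, "ab") as err_file:
        completed = subprocess.run(command, stdout=out_file, stderr=err_file)
    return completed.returncode
```

For example, `run_with_log_files(["My_program.exe", "Do", "some", "crazy", "stuff"], "my_out_file.txt", "my_error_file.txt")` would mirror the command line above.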
For Windows systems, use fluent-bit; it likely solves the issue of aggregating the logs of Windows programs. Windows support was implemented only recently.
fluent-bit supports:
The tail plugin can monitor a folder, which makes it practical to keep the configuration on the side of your program. Just make sure your different applications write their logs to a predictable directory.
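On the application side, 'a predictable directory' could look like the sketch below: each program instance writes one JSON object per line to its own file inside a shared log directory. The names `JsonLineFormatter` and `make_file_logger`, the JSON field names, and the `<app_name>.<pid>.log` naming scheme are all assumptions chosen for illustration.

```python
import json
import logging
import os


class JsonLineFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_file_logger(app_name, log_dir):
    """Create a logger that writes to <log_dir>/<app_name>.<pid>.log."""
    os.makedirs(log_dir, exist_ok=True)
    # Including the process id keeps several instances from sharing one file.
    path = os.path.join(log_dir, f"{app_name}.{os.getpid()}.log")
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(app_name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger, path
```

A new program or a second instance then only needs to call `make_file_logger` with the same directory; the collector configuration that watches that directory does not have to change.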
For Linux, just use fluentd (unless more than 100000 messages per second are required, which is where fluent-bit becomes your only choice).
For Windows, install Fluent-bit and make it run as a daemon (an almost funny solution).
There are 2 execution methods: passing all options on the command line, or pointing to a configuration file with the -c flag. Some example executions (without making use of a configuration file) can be found here:
PS .\bin\fluent-bit.exe -i winlog -p "channels=Setup,Windows PowerShell" -p "db=./test.db" -o stdout -m '*'
-i declares the input method. Currently, only a few plugins have been implemented; see the help output below.
PS fluent-bit.exe --help
Available Options
-b --storage_path=PATH specify a storage buffering path
-c --config=FILE specify an optional configuration file
-f, --flush=SECONDS flush timeout in seconds (default: 5)
-F --filter=FILTER set a filter
-i, --input=INPUT set an input
-m, --match=MATCH set plugin match, same as '-p match=abc'
-o, --output=OUTPUT set an output
-p, --prop="A=B" set plugin configuration property
-R, --parser=FILE specify a parser configuration file
-e, --plugin=FILE load an external plugin (shared lib)
-l, --log_file=FILE write log info to a file
-t, --tag=TAG set plugin tag, same as '-p tag=abc'
-T, --sp-task=SQL define a stream processor task
-v, --verbose increase logging verbosity (default: info)
-s, --coro_stack_size Set coroutines stack size in bytes (default: 98302)
-q, --quiet quiet mode
-S, --sosreport support report for Enterprise customers
-V, --version show version number
-h, --help print this help
Inputs
tail Tail files
dummy Generate dummy data
statsd StatsD input plugin
winlog Windows Event Log
tcp TCP
forward Fluentd in-forward
random Random
Outputs
counter Records counter
datadog Send events to DataDog HTTP Event Collector
es Elasticsearch
file Generate log file
forward Forward (Fluentd protocol)
http HTTP Output
influxdb InfluxDB Time Series
null Throws away events
slack Send events to a Slack channel
splunk Send events to Splunk HTTP Event Collector
stackdriver Send events to Google Stackdriver Logging
stdout Prints events to STDOUT
tcp TCP Output
flowcounter FlowCounter
Filters
aws Add AWS Metadata
expect Validate expected keys and values
record_modifier modify record
rewrite_tag Rewrite records tags
throttle Throttle messages using sliding window algorithm
grep grep events by specified field values
kubernetes Filter to append Kubernetes metadata
parser Parse events
nest nest events by specified field values
modify modify records by applying rules
lua Lua Scripting Filter
stdout Filter events to STDOUT
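Combining the flags and plugins listed above, a command line that tails a log directory and forwards it to fluentd could be assembled as in this sketch. The flags (-i, -p, -t, -o, -m) come from the help output; the property names `path`, `host` and `port` are assumed from the tail and forward plugin documentation, and the helper name `fluent_bit_tail_command` is hypothetical. Check the property names against your fluent-bit version before relying on them.

```python
def fluent_bit_tail_command(log_glob, host, port, tag, executable="fluent-bit"):
    """Build an argument list that tails `log_glob` and forwards to fluentd.

    Each -p property applies to the plugin declared just before it, so the
    order of the arguments matters.
    """
    return [
        executable,
        "-i", "tail",              # input plugin: tail files
        "-p", f"path={log_glob}",  # assumed tail property: files to watch
        "-t", tag,                 # tag for the records of this input
        "-o", "forward",           # output plugin: Fluentd forward protocol
        "-p", f"host={host}",      # assumed forward property: fluentd host
        "-p", f"port={port}",      # assumed forward property: fluentd port
        "-m", "*",                 # match all tags for this output
    ]
```

The resulting list can be passed to `subprocess.run`, or joined into a string to inspect the command before running it by hand.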