Tags: logging, azure-blob-storage, pipeline, fluentd, fluent-bit

How to configure Fluentbit / Fluentd input from Azure Blob Storage? What input type?


We're currently collecting IIS logs with the tail input and shipping them to New Relic using fluentbit.

fluent-bit.conf

[SERVICE]
   Flush         1
   Log_File      C:\inetpub\logs\LogFiles\W3SVC1\*.log
   Parsers_File  C:\Program Files\New Relic\newrelic-infra\parsers.conf

[INPUT]
   Name        tail
   Path        C:\inetpub\logs\LogFiles\W3SVC1\*.log
   Parser      iis_logs_parser
   Mem_Buf_Limit     1000MB

[OUTPUT]
   name      nrlogs
   match     *
   api_key   {{NewRelicKey}}

Now we'd like to collect another source of logs that we can access in Azure Blob Storage. We'd like to use fluentbit so that both data sources are parsed in (roughly) the same way, ensuring the collected fields are the same and extending them only with a source field. That way we can process/visualise both data sources in almost the same way.
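
For the extra source field we assume something like Fluent Bit's record_modifier filter would do, e.g. (the value iis is only a placeholder):

[FILTER]
   Name        record_modifier
   Match       *
   Record      source iis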

How to configure fluentbit to read logs from Azure Blob Storage? What fluentbit input are we looking for?


Solution

  • These are the supported fluentbit inputs: https://docs.fluentbit.io/manual/pipeline/inputs

    There is no input for Azure Blob Storage, nor for Amazon S3. Fluent Bit was designed as a lightweight/embedded log collector, so its inputs backlog is prioritized accordingly; the heavy lifting is usually handled by fluentd.

    I also checked fluentd: there are a couple of plugins for Azure Blob Storage, but I couldn't find one that supports input (the S3 plugin supports both input and output). It looks like the solution will be an Azure Function triggered by a storage event that reads the file and sends the data on (a rough sketch follows the diagram below).

    Local logs    -> FluentBit      -TCP-> fluentd-server -> destinations
    Azure storage -> Azure function -TCP-> fluentd-server -> destinations
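
    A rough sketch of the Azure Function side, assuming a Python blob-triggered function (v2 programming model) and the fluent-logger package to push each log line to the fluentd server over the forward protocol; the container name, connection setting, host, port and tags below are all placeholders:

    # function_app.py - illustrative sketch only
    import azure.functions as func
    from fluent import sender  # pip install fluent-logger

    app = func.FunctionApp()

    # Fires whenever a new blob lands in the (placeholder) "logs" container.
    @app.blob_trigger(arg_name="blob", path="logs/{name}",
                      connection="AzureWebJobsStorage")
    def forward_blob_logs(blob: func.InputStream):
        # Forward-protocol client pointing at the fluentd server (placeholder host/port).
        logger = sender.FluentSender("azure.blob", host="fluentd-server", port=24224)
        try:
            for line in blob.read().decode("utf-8").splitlines():
                if line.strip():  # skip empty lines
                    # Ship the raw line plus a source marker so downstream
                    # parsing/visualisation stays the same as for the IIS logs.
                    logger.emit("raw", {"message": line, "source": "azure-blob"})
        finally:
            logger.close()

    On the fluentd-server side the matching input is the standard forward source:

    <source>
      @type forward
      port 24224
      bind 0.0.0.0
    </source>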