I have a Service Fabric cluster with the following services:

1. A stateless Web API front end that receives data from clients.
2. A set of stateless services that process that data and save it to an Azure storage account.
I receive a lot of data from multiple clients and want to insert it into a set of Azure Service Bus queues, so that the services in 2. can process each item after receiving it from the queue.
The problem is that Service Bus won't accept messages larger than 256 KB (the Standard tier limit).
What is the best practice for storing those 1 MB chunks of data until they are processed and saved to the Azure storage account?
The naïve solution: a single-instance microservice that holds the data in its state manager, so that only a reference needs to go on the queue.
This breaks the microservice architecture, because a single instance doesn't scale.
Help please?
I agree with Mikhail that writing each request to blob storage from the Web API layer is the right solution. You would then queue a reference to the blob, and your second-tier service instances would dequeue those references and process each blob in turn.
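Here's a minimal sketch of the enqueue side, assuming Python with the azure-storage-blob and azure-servicebus SDKs; the connection strings, container, and queue names below are placeholders, not anything from your setup:

```python
import uuid

from azure.storage.blob import BlobServiceClient
from azure.servicebus import ServiceBusClient, ServiceBusMessage

# Placeholder resource names -- substitute your own.
STORAGE_CONN_STR = "<storage-account-connection-string>"
SERVICEBUS_CONN_STR = "<service-bus-connection-string>"
CONTAINER = "incoming-requests"   # assumed to already exist
QUEUE = "request-refs"

def enqueue_request(payload: bytes) -> None:
    """Write the ~1 MB payload to blob storage and queue only a reference to it."""
    blob_name = str(uuid.uuid4())

    # 1. Persist the payload in blob storage.
    blob_service = BlobServiceClient.from_connection_string(STORAGE_CONN_STR)
    blob_service.get_blob_client(container=CONTAINER, blob=blob_name).upload_blob(payload)

    # 2. Queue a small message containing only the blob name,
    #    which is well under the 256 KB limit.
    with ServiceBusClient.from_connection_string(SERVICEBUS_CONN_STR) as sb:
        with sb.get_queue_sender(queue_name=QUEUE) as sender:
            sender.send_messages(ServiceBusMessage(blob_name))
```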
This is a fairly common pattern in distributed systems (it's sometimes called the claim-check pattern)... there is a cost to each 1 MB write to blob storage, but there's no free lunch: unless you want to do the request processing right in the web tier, you need to save the request data somewhere. It sounds like you've already decided against processing in the web tier... that's generally good advice, but it depends on the nature of the processing, your anticipated request volume, VM capabilities, etc.
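And a corresponding sketch of the dequeue side, under the same assumptions (handle_payload is a hypothetical stand-in for your processing step). The message is completed only after processing succeeds, so a worker that crashes mid-processing just means the reference gets redelivered:

```python
from azure.storage.blob import BlobServiceClient
from azure.servicebus import ServiceBusClient

# Same placeholders as in the sketch above.
STORAGE_CONN_STR = "<storage-account-connection-string>"
SERVICEBUS_CONN_STR = "<service-bus-connection-string>"
CONTAINER = "incoming-requests"
QUEUE = "request-refs"

def handle_payload(payload: bytes) -> None:
    ...  # hypothetical: your processing + save to the storage account

def process_queue() -> None:
    """Dequeue blob references and process each payload in turn."""
    blob_service = BlobServiceClient.from_connection_string(STORAGE_CONN_STR)
    with ServiceBusClient.from_connection_string(SERVICEBUS_CONN_STR) as sb:
        with sb.get_queue_receiver(queue_name=QUEUE) as receiver:
            for msg in receiver:
                blob_name = str(msg)  # message body is just the blob name
                blob_client = blob_service.get_blob_client(
                    container=CONTAINER, blob=blob_name)

                payload = blob_client.download_blob().readall()
                handle_payload(payload)
                blob_client.delete_blob()  # optional cleanup once processed

                # Complete only after successful processing; on failure the
                # message reappears on the queue for another worker.
                receiver.complete_message(msg)
```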
I don't love the idea of using stateful actors/services in Service Fabric to hold the 1 MB request payloads, for the simple fact that it will get expensive to scale your cluster as request count (and needed RAM) grows. Factor in reliable replication of that 1 MB state across the cluster (or eschewing replication and waiting for the inevitable problems) and this is almost certainly a bad idea, relative to other options.
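For a rough sense of scale (hypothetical numbers): 100,000 in-flight 1 MB payloads, replicated 3 ways for reliability, is on the order of 300 GB of memory across the cluster before any overhead, versus pennies for the same data sitting in blob storage.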
Good luck!