Hello, I have lately been looking at TPL Dataflow with great interest, and I want to integrate it into my ASP.NET Core application.
I want to use it as a pipeline to which multiple methods from different parts of the application can post data.
What I do not know is where to store the linked blocks when they need to be called from multiple places.
Producer
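public class Producer
{
    // Plain fields: get-only properties can only be assigned inside the constructor itself.
    private BufferBlock<int> startBlock;
    private ActionBlock<int> ioBlock;
    private readonly IOService service;

    private void InitializeChain()
    {
        this.startBlock = new BufferBlock<int>();
        var transformLink = new TransformBlock<int, string>([something]);
        // some chain of blocks here
        this.ioBlock = new ActionBlock<int>(async x => await this.service.WriteAsync(x));
        // LinkTo returns an IDisposable (the link), not the target block,
        // so each link has to be a separate call rather than a fluent chain.
        this.startBlock.LinkTo([someBlock]);
        [someBlock].LinkTo([someOtherBlock]);
        // ......
        [someOtherBlock].LinkTo(this.ioBlock);
    }

    public async Task AddAsync(int data)
    {
        // SendAsync waits asynchronously if the block is bounded and currently full.
        await this.startBlock.SendAsync(data);
    }

    public Producer(IOService service)
    {
        this.service = service;
        this.InitializeChain();
    }
}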
API Producers
I envision this Producer being called from multiple parts of my application. I'll use controllers here for brevity:
public class C1 : Controller
{
    private readonly Producer producer;

    [HttpPost]
    [Route([someroute])]
    public async Task SomeRoute(int data)
    {
        await this.producer.AddAsync(data);
    }

    [HttpGet]
    [Route([someotherroute])]
    public async Task SomeOtherRoute(int data)
    {
        await this.producer.AddAsync(data);
    }

    public C1(Producer producer)
    {
        this.producer = producer;
    }
}
Startup
public void ConfigureServices(IServiceCollection services)
{
    services.AddSingleton<Producer>();
}
This can be extended to a scenario with multiple controllers, or to callers deeper in the hierarchy.
Now my question is:
How should the Producer that holds the Dataflow chain be injected? Should it be transient? Should the blocks be instantiated on every call?
I do not know whether this design is OK. I know TPL Dataflow is thread-safe, but can it be used this way?
P.S. Basically, I do not know in what form to keep my Dataflow pipeline, or what its lifetime should be, if I want it to be available for the entire lifetime of my ASP.NET Core application.
I want to collect data from multiple endpoints (directly or deeper in the call hierarchy), batch it, transform it, and control how it is finally written to an external source (an async operation).
Does this play nicely with the existing ThreadPool of ASP.NET Core?
P.S. 2: This question also haunts me for an Rx equivalent.
I recommend not directly linking your controllers to your background processor. For reliability reasons, there should be a persistent queue in between them. This can be an Azure Queue, Amazon Simple Queue Service (SQS), or even something old school like MSMQ or a database.
Your processor can be independent (Azure Function, AWS Lambda, or old school like a Win32 service), or it can be part of your web app (ASP.NET Core hosted service).
Your controller writes to the persistent queue and then returns. Your processor then reads messages from the queue and processes them. Your processor is what would use TPL Dataflow or Rx - whichever is more natural.
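As a rough sketch of that flow, and not a definitive implementation: the names `IPersistentQueue`, `InMemoryQueue`, `QueueProcessor`, and the `item-N` transform below are all hypothetical, and the in-memory channel only stands in for a real durable queue so that the example runs on its own (it is not persistent). The "controllers" enqueue and return; the processor reads from the queue and pushes items through a batch, transform, write pipeline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical abstraction over the persistent queue (Azure Queue, SQS, MSMQ, a database table, ...).
public interface IPersistentQueue
{
    Task EnqueueAsync(int item);
    IAsyncEnumerable<int> ReadAllAsync(CancellationToken ct);
}

// In-memory stand-in used only to make the sketch runnable. It is NOT durable.
public sealed class InMemoryQueue : IPersistentQueue
{
    private readonly Channel<int> channel = Channel.CreateUnbounded<int>();
    public Task EnqueueAsync(int item) => channel.Writer.WriteAsync(item).AsTask();
    public IAsyncEnumerable<int> ReadAllAsync(CancellationToken ct) => channel.Reader.ReadAllAsync(ct);
    public void Close() => channel.Writer.Complete();
}

// The processor owns the Dataflow pipeline; nothing else touches the blocks.
public sealed class QueueProcessor
{
    private readonly IPersistentQueue queue;
    private readonly BatchBlock<int> batcher;
    private readonly ActionBlock<string[]> writer;

    public QueueProcessor(IPersistentQueue queue, Func<string[], Task> writeAsync, int batchSize)
    {
        this.queue = queue;
        // Bounded capacity gives back-pressure; the value 10 is an arbitrary assumption.
        var bounded = new ExecutionDataflowBlockOptions { BoundedCapacity = 10 };
        var link = new DataflowLinkOptions { PropagateCompletion = true };

        batcher = new BatchBlock<int>(batchSize);
        var transformer = new TransformBlock<int[], string[]>(
            batch => batch.Select(x => $"item-{x}").ToArray(), bounded);
        writer = new ActionBlock<string[]>(writeAsync, bounded); // writeAsync stands in for the external write

        batcher.LinkTo(transformer, link);
        transformer.LinkTo(writer, link);
    }

    public async Task RunAsync(CancellationToken ct)
    {
        await foreach (var item in queue.ReadAllAsync(ct))
        {
            await batcher.SendAsync(item, ct);
        }
        batcher.Complete();      // flushes any partial batch and propagates completion
        await writer.Completion; // wait until everything has been written
    }
}

public static class Demo
{
    public static async Task<List<string[]>> RunAsync()
    {
        var queue = new InMemoryQueue();
        var written = new List<string[]>();
        var processor = new QueueProcessor(
            queue,
            batch => { lock (written) { written.Add(batch); } return Task.CompletedTask; },
            batchSize: 2);

        var run = processor.RunAsync(CancellationToken.None);

        // Stand-in for controller actions: each one only enqueues and returns.
        await Task.WhenAll(Enumerable.Range(1, 5).Select(i => queue.EnqueueAsync(i)));
        queue.Close();

        await run;
        return written;
    }

    public static async Task Main()
    {
        var batches = await RunAsync();
        foreach (var batch in batches) Console.WriteLine(string.Join(", ", batch));
    }
}
```

If the processor lives inside the web app, `QueueProcessor.RunAsync` would typically be called from a hosted service's `ExecuteAsync`, with the host's stopping token passed through. The pipeline is then created once for the lifetime of the application, and controllers only ever depend on the queue.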