I have a large CSV file that I want to process with a rate limit. The Splitter pattern provides exactly what I'm looking for, except I can't quite figure out how to combine it with the CSV component.
From the Splitter documentation, you can handle CSVs like this:
from("file:inbox")
.split().tokenize("\n", 1000).streaming()
.to("activemq:queue:order");
But ideally I'd like to use the Apache Camel CSV component to handle the marshalling and do something more like:
from("file:inbox")
.unmarshal().csv().split()
.streaming().parallelProcessing()
.throttle(requestsPerSecond)
.bean(new ValidateProcess(), "validate")
.marshal().csv().to("file:outbox");
I know the code above is completely wrong but hopefully it conveys what I'm trying to achieve. Would this be at all feasible?
For some reason I couldn't figure this out earlier. I think the issue was that my classpath wasn't picking up the dependency on org.apache.camel:camel-csv. Once I sorted that out, everything was fine.
Here's what I ended up with:
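// Semicolon-delimited input, read lazily so the whole file is not loaded into memory
final CsvDataFormat csv = new CsvDataFormat(";");
csv.setLazyLoad(true);
csv.setSkipFirstLine(true);
// Split the unmarshalled rows and stream them through the route in parallel
from(in).unmarshal(csv).split(body()).streaming().parallelProcessing()
    // Flag each row as valid or not, then keep only the valid ones
    .bean(validator, "validateNumber")
    .filter(header(ValidateProcess.Valid).isEqualTo(true))
    // Limit the API calls to tps requests per second
    .throttle(tps).bean(validator, "validate")
    .marshal().csv()
    .to(out).log("done.").end();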
Basically, I wanted to stream-process a CSV file containing numbers against an API that is rate limited to 50 TPS, and write the results to a CSV file.
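For anyone who wants to see the same idea without Camel, here is a minimal plain-Java sketch of what the route does conceptually: skip the header, read rows one at a time, drop invalid rows, and pace the remaining calls to a fixed TPS. This is not how Camel implements its throttler. The class name ThrottledCsvSketch, the isValidNumber check and the ";checked" suffix are hypothetical stand-ins for the validator bean and the real API call, and the parsing is a naive split on ';' with no quoting support.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class ThrottledCsvSketch {

    /** Naive split on ';' (illustration only, no quoted-field handling). */
    public static List<String> parseLine(String line) {
        return Arrays.asList(line.split(";", -1));
    }

    /** Minimum gap between two calls, in nanoseconds, for a given TPS. */
    public static long intervalNanos(int tps) {
        return 1_000_000_000L / tps;
    }

    /** Hypothetical stand-in for the validateNumber step: digits only. */
    public static boolean isValidNumber(String s) {
        return s.matches("\\d+");
    }

    /** Simple pacing throttle: consecutive acquire() calls are spaced by the interval. */
    static final class Throttle {
        private final long interval;
        private long nextSlot = System.nanoTime();

        Throttle(int tps) {
            this.interval = intervalNanos(tps);
        }

        synchronized void acquire() {
            long wait = nextSlot - System.nanoTime();
            if (wait > 0) {
                try {
                    TimeUnit.NANOSECONDS.sleep(wait);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while throttling", e);
                }
            }
            // Schedule the next slot relative to whichever is later: now or the old slot.
            nextSlot = Math.max(nextSlot, System.nanoTime()) + interval;
        }
    }

    /** Consumes lines one at a time, so the input never has to be held in memory at once. */
    public static List<String> process(Iterator<String> lines, int tps) {
        Throttle throttle = new Throttle(tps);
        List<String> out = new ArrayList<>();
        if (lines.hasNext()) {
            lines.next(); // skip the header row
        }
        while (lines.hasNext()) {
            List<String> fields = parseLine(lines.next());
            if (!isValidNumber(fields.get(0))) {
                continue; // filter: invalid rows never reach the throttle
            }
            throttle.acquire(); // only rows that would hit the API are rate limited
            out.add(String.join(";", fields) + ";checked"); // stand-in for the API call
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("number", "123", "abc", "456");
        System.out.println(process(input.iterator(), 50));
    }
}
```

Filtering before the throttle, as in the route above, means invalid rows do not use up any of the 50 TPS budget.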