I'm trying to determine the best design for my REST API. I have a data manipulation pipeline whose steps run sequentially, and each step is exposed as its own REST call.
I have been saving the intermediate data to files on the server to keep the REST API stateless. However, I'm worried about the cost of file I/O and serialization, since I write out 3-4 files and then read them back later in the pipeline.
An alternative is to keep the data in memory while the Java web app is running, but that seems to make the system stateful. What are the pros and cons of these options?
Saving intermediate results anywhere makes your services stateful. How you store state has a big impact on scalability. Once your services are running on multiple instances, the state information has to be shared among them. If you use files, then you must use a file server accessible to all of them. Another option is a database.
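One way to keep that storage decision open is to hide it behind a small interface, so the pipeline steps do not care whether the data lives in memory, on a file server, or in a database. The following is only a rough sketch: `IntermediateStore`, `InMemoryStore`, and the run ID / step name keys are hypothetical names invented for illustration, not part of any framework.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction: each pipeline step reads and writes through this
// interface and does not know where the bytes are actually kept.
interface IntermediateStore {
    void put(String runId, String step, byte[] data);
    Optional<byte[]> get(String runId, String step);
}

// In-memory implementation. It avoids file I/O, but it is only correct while
// a single server instance handles every call of a given pipeline run.
class InMemoryStore implements IntermediateStore {
    private final Map<String, byte[]> entries = new ConcurrentHashMap<>();

    private static String key(String runId, String step) {
        return runId + "/" + step;
    }

    @Override
    public void put(String runId, String step, byte[] data) {
        // Copy defensively so later changes by the caller do not leak in.
        entries.put(key(runId, step), data.clone());
    }

    @Override
    public Optional<byte[]> get(String runId, String step) {
        byte[] data = entries.get(key(runId, step));
        return data == null ? Optional.empty() : Optional.of(data.clone());
    }
}
```

With this shape you could start with the in-memory version and, once you run several instances, swap in an implementation backed by a shared file server or a database without touching the pipeline steps.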
You could also consider passing the intermediate data to the client; the client would pass the data back on the next call.
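If you go the client round-trip route, the server probably should not trust the data blindly when it comes back. Below is a minimal sketch, assuming the intermediate data is small enough to travel in a request and that every server instance shares a secret key: the server attaches an HMAC when it hands the data out and checks it on the next call. `StepToken` and its methods are hypothetical names; the only library calls used are from the standard `javax.crypto` and `java.util.Base64` APIs.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: wraps intermediate data in a token of the form
// base64url(data) + "." + base64url(HMAC-SHA256(data)).
final class StepToken {
    private final byte[] key; // assumed to be shared by all server instances

    StepToken(byte[] key) {
        this.key = key.clone();
    }

    // Called at the end of a step: returns the token sent to the client.
    String issue(byte[] data) {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        return enc.encodeToString(data) + "." + enc.encodeToString(sign(data));
    }

    // Called at the start of the next step: returns the data, or throws
    // if the token is malformed or the signature does not match.
    byte[] verify(String token) {
        String[] parts = token.split("\\.", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("malformed token");
        }
        Base64.Decoder dec = Base64.getUrlDecoder();
        byte[] data = dec.decode(parts[0]);
        byte[] signature = dec.decode(parts[1]);
        // MessageDigest.isEqual compares the two arrays in constant time.
        if (!MessageDigest.isEqual(signature, sign(data))) {
            throw new SecurityException("signature mismatch");
        }
        return data;
    }

    private byte[] sign(byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }
}
```

This keeps the server free of per-run state, at the price of sending the data over the network twice per step. The signature only detects tampering; it does not hide the data from the client.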