We do something called feature testing, as described here: https://blog.twitter.com/engineering/en_us/topics/insights/2017/the-testing-renaissance.html
TL;DR of that article: we send a request to the microservice (a REST POST with a body), mock GCP Storage, and mock the downstream API calls, so the entire microservice can be refactored behind the test. We can also swap out our platforms/libs with no changes to our tests, which makes us extremely agile.
My first question is: can Dataflow (Apache Beam) receive a REST request to trigger a job? Much of the API is around 'create job', but I don't see 'execute job' in the docs, although I do see that get status returns the status of a job execution. I just don't see a way to trigger a job to
Then, in my test, I simply want to simulate the HTTP call, return a real customer file when the file is read, and, once the job is done, verify that all the correct requests were sent to the downstream APIs.
We are using Apache Beam in our feature tests, though I'm not sure it's the same version as Google's Dataflow, which would be ideal. Is there a reported Apache Beam version for Google's Dataflow that we can get?
thanks, Dean
Apache Beam's DirectRunner should be very close to Dataflow's environment, and it's what we recommend for this type of single-process pipeline test.
My advice would be the same: use the DirectRunner for your feature tests.
You can also use the Dataflow runner, but that sounds like it would be a full integration test. Depending on the data source / data sink, you may be able to pass it mocking utilities.
BigQueryIO is a good example. It has a withTestServices method that you can use to pass objects that mock the behavior of external services.
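
To illustrate the general idea of passing in objects that mock external services, here is a minimal plain-Python sketch. It deliberately does not use Beam or the BigQueryIO API. FakeStorage, RecordingApi, process_customer_file, the bucket path, and the CSV layout are all hypothetical names and data made up for this example, so treat it as the shape of such a test rather than as your actual job.

```python
from dataclasses import dataclass, field


class FakeStorage:
    """Hypothetical stand-in for a storage client; serves canned file contents."""

    def __init__(self, files):
        self._files = files  # maps path -> file contents as one string

    def read_lines(self, path):
        return self._files[path].splitlines()


@dataclass
class RecordingApi:
    """Hypothetical stand-in for the downstream API; records each request."""

    requests: list = field(default_factory=list)

    def post(self, payload):
        self.requests.append(payload)


def process_customer_file(path, storage, api):
    """Hypothetical job body: sends one downstream request per non-empty row."""
    for line in storage.read_lines(path):
        if not line.strip():
            continue
        customer_id, name = line.split(",", 1)
        api.post({"id": customer_id.strip(), "name": name.strip()})


def run_feature_test():
    # Assumed input: a two-row customer file at a made-up path.
    storage = FakeStorage({"gs://bucket/customers.csv": "1,Ada\n2,Grace\n"})
    api = RecordingApi()
    process_customer_file("gs://bucket/customers.csv", storage, api)
    return api.requests


if __name__ == "__main__":
    # Verify that the expected downstream requests were recorded.
    assert run_feature_test() == [
        {"id": "1", "name": "Ada"},
        {"id": "2", "name": "Grace"},
    ]
```

Because the job body only sees the storage and API objects it is handed, the same test could keep passing if the code behind it is refactored or moved onto a different runner, which is the property the feature-testing approach is after.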