python, google-cloud-platform, google-cloud-dataflow, dataflow

Python Dataflow job not accepting messages from a Pub/Sub subscription after deploying with the requirements_file parameter


I want to use a Dataflow job to encrypt incoming messages from a Pub/Sub subscription before writing them to BigQuery. I am using the Python libraries pycryptodome==3.9.8 and cryptography==3.1 to do that.

In the Dataflow job, I am using the following two imports:

from Crypto import Random
from Crypto.Cipher import AES

When I deploy the Dataflow pipeline without the --requirements_file parameter, it deploys perfectly, but after I publish a message to the topic it throws an error:

ModuleNotFoundError: No module named 'Crypto' [while running 'generatedPtransform-81']

After that, I deployed the pipeline again with the --requirements_file requirement.txt flag. The Dataflow pipeline deploys fine, but now it does not accept any messages from the subscription. The job reports no error, since it never fetches a message.

Am I missing something? Since nothing is logged, it's very difficult to identify the problem.


Solution

  • Re-posting the comment by @peter-kim as an answer: use a setup.py file to declare the dependencies instead of --requirements_file, and you should be able to do what you need. See "Dataflow fails when I add requirements.txt [Python]".
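
To make the setup.py suggestion concrete, here is a minimal sketch rather than a definitive setup. It assumes the two pinned libraries from the question are the only extra dependencies. The package name, version, project, region and bucket values are hypothetical placeholders, and the helper names write_setup_py and dataflow_args are invented for illustration. The script only writes a setup.py and builds the launch flags, replacing --requirements_file with --setup_file.

```python
from pathlib import Path
from textwrap import dedent

# Contents of the setup.py to place next to the pipeline script.
# The pins come from the question; name and version are hypothetical.
SETUP_PY = dedent('''\
    import setuptools

    setuptools.setup(
        name="pubsub_encrypt_pipeline",  # hypothetical package name
        version="0.0.1",                 # hypothetical version
        install_requires=[
            "pycryptodome==3.9.8",
            "cryptography==3.1",
        ],
        packages=setuptools.find_packages(),
    )
''')


def write_setup_py(directory="."):
    """Write setup.py into `directory` and return its path."""
    path = Path(directory) / "setup.py"
    path.write_text(SETUP_PY)
    return path


def dataflow_args(setup_file, project, region, temp_location):
    """Build launch flags that use --setup_file, not --requirements_file."""
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        "--streaming",  # the job reads from a Pub/Sub subscription
        f"--setup_file={setup_file}",
    ]


if __name__ == "__main__":
    setup_path = write_setup_py(".")
    # Placeholder values; substitute your own project, region and bucket.
    for flag in dataflow_args(setup_path, "my-project", "us-central1",
                              "gs://my-bucket/tmp"):
        print(flag)
```

In practice you would probably just save the contents of SETUP_PY as setup.py by hand. The resulting flag list can be given to the pipeline script on the command line or passed to Apache Beam's PipelineOptions. Whether this resolves the stalled subscription depends on the job, so treat it as a starting point.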