I am writing a Memgraph transformation in Python.
When I import modules such as "requests" or "networkx", the transformation works as expected.
My data is Avro-encoded with a schema registry, so I need to deserialize it. I followed the Memgraph example here: https://memgraph.com/docs/memgraph/2.3.0/import-data/kafka/avro#deserialization
When I save the transformation with those imports, I receive the error:
[Error] Unable to load module "/memgraph/internal_modules/test_try_plz.py";
Traceback (most recent call last):
  File "/memgraph/internal_modules/test_try_plz.py", line 4, in <module>
    from confluent_kafka.schema_registry import SchemaRegistryClient
ModuleNotFoundError: No module named 'confluent_kafka'.
For more details, visit https://memgr.ph/modules.
How can I update my transformation or my Memgraph instance to include the confluent_kafka module?
The link provided in the error message did not give me any leads.
You cannot add Python dependencies to Memgraph in Memgraph Cloud (at least not on the free trial).
Instead, build your own Docker image and use that, e.g.:
FROM memgraph/memgraph:2.5.2
USER root
# Install Python tooling (pip, setuptools, dev headers)
RUN apt-get update && apt-get install -y \
python3-pip \
python3-setuptools \
python3-dev \
&& pip3 install -U pip
# Install pip packages
COPY requirements.txt ./
RUN pip3 install -r requirements.txt
# Copy the local query modules
COPY transformations/ /usr/lib/memgraph/query_modules
COPY procedures/ /usr/lib/memgraph/query_modules
USER memgraph
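With the Dockerfile, requirements.txt, and the transformations/ and procedures/ directories in one folder, the image can be built and run roughly like this (the image tag and port mappings are placeholders; adjust to your setup):

```shell
# Build the custom image from the Dockerfile above
docker build -t memgraph-custom:2.5.2 .

# Run it in place of the stock memgraph/memgraph image,
# exposing the Bolt port (7687) as usual
docker run -it -p 7687:7687 memgraph-custom:2.5.2
```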
And here is my requirements.txt; all of these packages are required for a transformation leveraging the Confluent schema-registry/Avro packages:
confluent_kafka==2.0.2
fastavro==1.7.1
requests==2.28.2
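For context on why confluent_kafka is needed at all (rather than plain fastavro): Confluent's serializers prepend a 5-byte header to every Kafka message, a zero magic byte followed by a big-endian 4-byte schema ID, which the AvroDeserializer resolves against the registry before decoding. A stdlib-only sketch of splitting off that header (decoding the Avro payload itself still requires fastavro/confluent_kafka; the function name here is just illustrative):

```python
import struct


def parse_confluent_header(message: bytes) -> tuple[int, bytes]:
    """Split a Confluent-framed Kafka message into (schema_id, avro_payload).

    Confluent wire format: 1 magic byte (0x00), a 4-byte big-endian
    schema ID, then the Avro-encoded payload.
    """
    if len(message) < 5 or message[0] != 0:
        raise ValueError("not a Confluent-framed message")
    schema_id = struct.unpack(">I", message[1:5])[0]
    return schema_id, message[5:]


# Example: a synthetic message carrying schema ID 42 and a 3-byte payload.
msg = b"\x00" + struct.pack(">I", 42) + b"\x01\x02\x03"
schema_id, payload = parse_confluent_header(msg)
```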