We are currently exploring Weaviate to introduce vector search into our system, but we do not want to use external API vectorizers such as text2vec-openai or text2vec-huggingface. We just want to use Weaviate's pre-built transformers model container sentence-transformers-multi-qa-MiniLM-L6-cos-v1 in our setup. According to their documentation, I just need something like this in our docker-compose.yml:
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.18.3
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers,backup-s3
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
      CLUSTER_HOSTNAME: 'node1'
      # Other settings removed for brevity...
  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: 1 # GPU-enabled
For higher availability, we will run a two-node setup, which is clearly documented here. However, that example uses no vectorizer module, so how do we set up a multi-node Weaviate cluster where the vectorizer module is also highly available?
Take the multi-node docker-compose file, add the t2v-transformers service, and for each Weaviate node add the related environment variables to enable the modules you want, e.g.:

      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers,backup-s3
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
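A minimal sketch of the combined file might look like the following. The cluster environment variables and port values here follow the documented multi-node example (verify them against the version you run), and the `deploy.replicas` setting assumes Docker Compose v2, where it starts multiple inference containers behind the same service name so Docker's built-in DNS spreads requests across them (alternatively, `docker compose up --scale t2v-transformers=2` achieves the same effect). Volumes, ports, and other settings are omitted for brevity:

```yaml
version: '3.4'
services:
  weaviate-node-1:
    image: semitechnologies/weaviate:1.18.3
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers,backup-s3
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
      CLUSTER_HOSTNAME: 'node1'
      CLUSTER_GOSSIP_BIND_PORT: '7100'
      CLUSTER_DATA_BIND_PORT: '7101'
  weaviate-node-2:
    image: semitechnologies/weaviate:1.18.3
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers,backup-s3
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
      CLUSTER_HOSTNAME: 'node2'
      CLUSTER_GOSSIP_BIND_PORT: '7102'
      CLUSTER_DATA_BIND_PORT: '7103'
      # Join the cluster via node 1's gossip port
      CLUSTER_JOIN: 'weaviate-node-1:7100'
  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: 1 # GPU-enabled
    deploy:
      # Two inference containers behind one service name for basic
      # redundancy; all Weaviate nodes keep pointing at
      # http://t2v-transformers:8080
      replicas: 2
```

Both Weaviate nodes share the same TRANSFORMERS_INFERENCE_API value, so scaling the inference service is independent of how many Weaviate nodes you run.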