Tags: redis, vector-database

Install Redis vector database on GCP in a GKE cluster


I want to install Redis Vector DB in GCP running on a GKE cluster. I am not sure how to start the process and complete the installation.

I have seen the steps shared in the LangChain framework documentation to install and run the Redis vector store locally using this command: docker run -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest

But I want to do the setup in a GKE cluster so that I can connect to the vector DB and store embeddings.

Please share any step-by-step approach that I can follow to perform this setup.


Solution

  • Based on your question, it seems like you are really new to using Redis as a vector database. For that reason, the first thing I suggest is to install redis-stack-server on an instance outside of Kubernetes before you attempt this in a Kubernetes environment, verify connectivity, and get familiar with the ACL feature available since Redis 6. ACLs allow named users to be created and assigned fine-grained permissions. The advantage of ACLs is that they limit connections in terms of the commands that can be executed and the keys that can be accessed. A connecting client is required to provide a username and password, and if authentication succeeds, the connection is associated with the given user and the limits that user has. From the server where you installed redis-stack-server, run the redis-cli utility against localhost and validate that the default user exists:

    > ACL LIST
    "user default on nopass sanitize-payload ~* &* +@all”
    

    This indicates the default user is active, requires no password, has access to all keys, and can run all commands. You can create your own user and assign them permissions:

    > ACL SETUSER langchain on >secret allcommands allkeys 
    

    Here we create a user called langchain, set them as active, set their password to secret, and gave them access to all keys and commands.
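
    As a quick sanity check (using the standard AUTH and ACL WHOAMI commands), you can authenticate as the new user from redis-cli and confirm which user the connection is bound to:

    > AUTH langchain secret
    OK
    > ACL WHOAMI
    "langchain"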

    Note that if you eventually want to validate this connection from outside the host (for example, from a Kubernetes cluster on a different network), you must understand that protected mode is enabled by default, which means that from a different network you will still face issues connecting. In Rocky Linux, that can be addressed by editing /etc/redis-stack.conf and setting protected-mode no, which disables protected mode. Since version 3.2.0, Redis enters protected mode when it is executed with the default configuration and without any password. This was designed as a preventative guardrail, thus only allowing replies to queries from the loopback interface.
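
    As a sketch, on a systemd-based install such as Rocky Linux, disabling protected mode could look like the following (this assumes redis-stack-server runs as a systemd service and that the config file already contains a protected-mode line; add the line if it is absent):

    # Flip protected-mode from yes to no in the redis-stack config
    sudo sed -i 's/^protected-mode yes/protected-mode no/' /etc/redis-stack.conf
    # Restart the service so the change takes effect
    sudo systemctl restart redis-stack-server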

    Now with access set up correctly, you can verify connectivity both from the CLI and from a Python script. From the CLI:

    redis-cli -h [host] -p 6379 --user [user] --pass [password]
    

    From a Python script:

    import os

    import redis
    from dotenv import load_dotenv

    load_dotenv()

    redis_config = {
      'host': os.environ['REDIS_HOST'],
      'port': int(os.environ['REDIS_PORT']),
      'decode_responses': True,
      'health_check_interval': 30,
      'username': os.environ['REDIS_USERNAME'],
      'password': os.environ['REDIS_PASSWORD']
    }

    client = redis.Redis(**redis_config)
    res = client.ping()
    print(f'PING: {res}')
    # >>> PING: True
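
    For completeness, the script above assumes a .env file (read by python-dotenv) along these lines; the values shown are placeholders you would replace with your own:

    REDIS_HOST=10.0.0.5
    REDIS_PORT=6379
    REDIS_USERNAME=langchain
    REDIS_PASSWORD=secret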
    

    At this point, you should understand the authentication process of using redis-stack-server (which is what is layered into the Docker image redis/redis-stack:latest you referenced in your question). I want to draw your attention to which modules redis-stack-server loads. If you cat that same /etc/redis-stack.conf file, you will see the loaded modules:

    $ sudo cat /etc/redis-stack.conf 
    port 6379
    daemonize no
    protected-mode no
    loadmodule /opt/redis-stack/lib/rediscompat.so
    loadmodule /opt/redis-stack/lib/redisearch.so
    loadmodule /opt/redis-stack/lib/redistimeseries.so
    loadmodule /opt/redis-stack/lib/rejson.so
    loadmodule /opt/redis-stack/lib/redisbloom.so
    loadmodule /opt/redis-stack/lib/redisgears.so v8-plugin-path /opt/redis-stack/lib/libredisgears_v8_plugin.so
    

    I want to draw your attention specifically to redisearch. It is a module that extends Redis with vector similarity search features. If you check that module's GitHub page, notice it is tagged "vector-database". What does that mean? You can run vector similarity search over your embeddings, either approximate nearest neighbor (ANN) search using Hierarchical Navigable Small World (HNSW) graphs or exact K-Nearest-Neighbor (KNN) search with a FLAT index.
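
    To make that concrete, here is a sketch of creating a vector index directly from redis-cli. The index name idx:docs, the doc: key prefix, and DIM 384 (the output dimension of an embedding model such as sentence-transformers/all-MiniLM-L6-v2) are illustrative assumptions:

    > FT.CREATE idx:docs ON HASH PREFIX 1 doc: SCHEMA content TEXT embedding VECTOR HNSW 6 TYPE FLOAT32 DIM 384 DISTANCE_METRIC COSINE
    OK

    A KNN query then passes the query vector as a blob of packed float32 bytes:

    > FT.SEARCH idx:docs "*=>[KNN 3 @embedding $vec AS score]" PARAMS 2 vec "<packed float32 bytes>" SORTBY score DIALECT 2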

    The bottom line is that by installing Redis this way first, you can get a deeper understanding of what is being installed, how it works, and how it functions as a vector store. Once you grasp those concepts, you can decide to use your own image or the one available on Docker Hub: https://hub.docker.com/r/redis/redis-stack. Either way, you end up with a Docker image that can be deployed in a Kubernetes cluster or, preferably, outside of it (pods are ephemeral, so running a database in-cluster means reaching for a StatefulSet with persistent storage). But I recommend keeping the database outside of the Kubernetes cluster, especially since this is the first time you have tried this.

    Now, you would need to configure your application to be deployed in the Kubernetes cluster. In the context of LangChain, you typically use a document loader (e.g. from langchain.document_loaders.pdf import PyPDFLoader) to load the raw documents. Then you typically use a text splitter to break the large documents into chunks with a specified chunk size, chunk overlap and separators (e.g. from langchain.text_splitter import RecursiveCharacterTextSplitter). Then you load the embeddings model (e.g. from langchain.embeddings import HuggingFaceEmbeddings); in this example, I am using HuggingFace embeddings. Finally, you would use the Redis vectorstore that LangChain provides:

    import os

    from langchain.document_loaders.pdf import PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings.huggingface import HuggingFaceEmbeddings
    from langchain_community.vectorstores.redis import Redis

    # Load the raw PDF pages
    loader = PyPDFLoader('path-to-doc')
    raw_documents = loader.load()

    # Split into overlapping chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20, separators=['\n'])
    documents = text_splitter.split_documents(raw_documents)

    # Embed the chunks and store them in Redis
    embeddings = HuggingFaceEmbeddings()
    rds = Redis.from_documents(
        documents,
        embeddings,
        redis_url=os.environ['REDIS_URL'],
        index_name="users",
    )
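
    Once the documents are loaded, a quick way to verify the store works end to end is a similarity search (the query string and k here are just examples):

    results = rds.similarity_search('what is this document about?', k=3)
    for doc in results:
        print(doc.page_content)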
    

    Hence, our application is using Redis as a vector store. Now build a Docker image out of this Python application and make it part of your Kubernetes cluster. It should run as a Deployment, whose ReplicaSet controls the number of pods to run, and you should specify readiness and liveness probes as well to health-check the pods in your Deployment; a sketch of such a manifest follows.
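
    As a rough sketch (the image name, container port, probe endpoint, and Redis address are all hypothetical placeholders for your own app), the manifest could look like this:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: embeddings-app
    spec:
      replicas: 2                  # pods managed by the Deployment's ReplicaSet
      selector:
        matchLabels:
          app: embeddings-app
      template:
        metadata:
          labels:
            app: embeddings-app
        spec:
          containers:
          - name: embeddings-app
            image: gcr.io/PROJECT_ID/embeddings-app:v1   # hypothetical image
            ports:
            - containerPort: 8080
            env:
            - name: REDIS_URL
              value: redis://langchain:secret@10.0.0.5:6379   # your Redis host
            readinessProbe:        # traffic is routed only when this passes
              httpGet:
                path: /healthz
                port: 8080
            livenessProbe:         # the pod is restarted if this fails
              httpGet:
                path: /healthz
                port: 8080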

    At this point, with the knowledge you now have, you can deploy this in a GKE cluster. The Deploy an app in a container image to a GKE cluster guide goes through the specifics of taking the application (in our case the Python app), containerizing it with Cloud Build, creating a GKE cluster, and deploying it to GKE.
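
    In outline, and reusing the hypothetical names from the manifest above (the project ID, cluster name, and region are placeholders), those steps boil down to something like:

    # Build and push the image with Cloud Build
    gcloud builds submit --tag gcr.io/PROJECT_ID/embeddings-app:v1

    # Create an Autopilot GKE cluster and fetch kubectl credentials
    gcloud container clusters create-auto embeddings-cluster --region us-central1
    gcloud container clusters get-credentials embeddings-cluster --region us-central1

    # Deploy the application manifest
    kubectl apply -f deployment.yaml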