
Blue/green deployment to Portainer using GitLab CI/CD


I have a web service that uses WebSockets, and I need to implement zero-downtime deployment. Because I don't want to drop existing connections on deploy, I've decided to implement blue/green deployment. My current solution looks like this:

  1. I've created two identical services in Portainer, listening on different ports. Each service has an identifier set in its Node environment variables, for example alfa and beta.
  2. Both services sit behind a load balancer, which periodically checks the status of each service. If a service responds on a specific route (/balancer-keepalive-check) with the string "OK", it is active and the balancer can route to it. If a service responds with the string "STOP", the balancer marks it as inaccessible, but active connections are preserved.
  3. Which service is active and which is stopped is synced over Redis. Redis holds the keys lb-status-alfa and lb-status-beta, which can contain the value 1 for active and 0 for inactive. Example implementation of the /balancer-keepalive-check route in NestJS:
    import {Controller, Get} from '@nestjs/common';
    import {RedisClient} from "redis";
    import {promisify} from "util";
    
    
    @Controller()
    export class AppController {
    
        private redisClient = new RedisClient({host: process.env.REDIS_HOST});
        private serviceId:string = process.env.ID;  //alfa, beta
    
        @Get('balancer-keepalive-check')
        async balancerCheckAlive(): Promise<string> {
            const getAsync = promisify(this.redisClient.get).bind(this.redisClient);
            // Redis returns the stored value as a string, or null if the key is missing
            const status = await getAsync(`lb-status-${this.serviceId}`);
            const reply: string = status === '1' ? 'OK' : 'STOP';
            return `<response>${reply}</response>`;
        }
    }
  4. In GitLab CI, I build a Docker image tagged with the commit tag and restart the service by calling the Portainer webhook for that specific service. This works well for one service, but I don't know how to use two different DEPLOY_WEBHOOK CI variables and switch between them.
    image: registry.rassk.work/pokec/pokec-nodejs-build-image:p1.0.1
    services:
      - name: docker:dind
    
    variables:
      DOCKER_TAG: platform-websocket:$CI_COMMIT_TAG
    
    deploy:
      tags:
        - dtm-builder
      environment:
        name: $CI_COMMIT_TAG
      script:
        - npm set registry http://some-private-npm-registry-url.sk
        - if [ "$ENV_CONFIG" ]; then cp $ENV_CONFIG $PWD/.env; fi
        - if [ "$PRIVATE_KEY" ]; then cp $PRIVATE_KEY $PWD/privateKey.pem; fi
        - if [ "$PUBLIC_KEY" ]; then cp $PUBLIC_KEY $PWD/publicKey.pem; fi
        - docker build -t $DOCKER_TAG .
        - docker tag $DOCKER_TAG registry.rassk.work/community/$DOCKER_TAG
        - docker push registry.rassk.work/community/$DOCKER_TAG
        - curl --request POST $DEPLOY_WEBHOOK
      only:
        - tags
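
The keepalive rule from step 3 can be isolated into a pure function, which makes the edge cases easier to see. This is only a sketch: the helper names `replyForStatus` and `renderReply` are hypothetical, and it assumes the Redis client hands back the stored value as a string, or null when the key does not exist.

```typescript
type BalancerReply = 'OK' | 'STOP';

// Hypothetical helper: maps the raw Redis value for a service to the
// word the balancer looks for. Anything other than the string '1',
// including a missing key (null), is treated as inactive.
function replyForStatus(status: string | null): BalancerReply {
    return status === '1' ? 'OK' : 'STOP';
}

// Wraps the reply in the same <response> envelope the controller returns.
function renderReply(status: string | null): string {
    return `<response>${replyForStatus(status)}</response>`;
}
```

With this shape, a service whose key was never written reports STOP, so the balancer does not route to it by accident.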

My questions, which I don't know how to solve, are:

  • With two services, I have two different deploy webhooks and need to call only one of them on each deploy, because I don't want to restart both services. How do I determine which one? How can I implement some kind of counter that tracks whether this deploy goes to the "alfa" or the "beta" service? Should I use the GitLab API and update DEPLOY_WEBHOOK after each deploy? Or should I get rid of this GitLab CI/CD variable and use some API on the services that tells me the webhook URL?
  • How do I update the values in Redis? Should I implement a custom API for this?
  • Is there a better way to achieve this?

Additional info: I can't use the GitLab API from the services, because our GitLab is self-hosted on a domain accessible only from our private network.


Solution

  • I've modified my AppController. There are now two new endpoints: one to identify which service is running, and a second to switch the values in Redis:

    private serviceId:string = process.env.ID || 'alfa';
    
        @Get('running-service-id')
        info(){
            return this.serviceId
        }
    
        @Get('switch')
        switch(){
            const play = this.serviceId == 'alfa' ? `lb-status-beta` : `lb-status-alfa`;
            const stop = `lb-status-${this.serviceId}`;
            this.redisClient.set(play, '1', (err) => {
                if(!err){
                    this.redisClient.set(stop, '0');
                }
            })
        }
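
The ordering inside the switch handler matters: the idle service is marked active before the current one is marked inactive, so there is no moment when both report STOP. Below is a minimal sketch of that ordering with a promise-based store, so the caller can wait until both writes have finished. `KeyValueStore`, `MemoryStore` and `switchActive` are hypothetical names standing in for a real Redis client, not part of the original controller.

```typescript
type ServiceId = 'alfa' | 'beta';

// Minimal async key-value interface; a stand-in for a promise-based Redis client.
interface KeyValueStore {
    get(key: string): Promise<string | null>;
    set(key: string, value: string): Promise<void>;
}

// In-memory implementation, used only for this illustration.
class MemoryStore implements KeyValueStore {
    private data = new Map<string, string>();

    async get(key: string): Promise<string | null> {
        const value = this.data.get(key);
        return value === undefined ? null : value;
    }

    async set(key: string, value: string): Promise<void> {
        this.data.set(key, value);
    }
}

const statusKey = (id: ServiceId): string => `lb-status-${id}`;
const otherService = (id: ServiceId): ServiceId => (id === 'alfa' ? 'beta' : 'alfa');

// Activates the idle service first, then deactivates the current one.
// If the first write throws, the second never runs and the current
// service stays active.
async function switchActive(store: KeyValueStore, current: ServiceId): Promise<ServiceId> {
    const next = otherService(current);
    await store.set(statusKey(next), '1');
    await store.set(statusKey(current), '0');
    return next;
}
```

Returning the new active ID gives the caller (for example the CI job) something to log or verify after the switch.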
    

    After that, I modified my .gitlab-ci.yml as follows:

    image: registry.rassk.work/pokec/pokec-nodejs-build-image:p1.0.1
    services:
      - name: docker:dind
    
    stages:
      - build
      - deploy
      - switch
    
    variables:
      DOCKER_TAG: platform-websocket:$CI_COMMIT_TAG
    
    test:
      stage: build
      allow_failure: true
      tags:
        - dtm-builder
      script:
        - npm set registry http://some-private-npm-registry-url.sk
        - npm install
        - npm run test
    
    build:
      stage: build
      tags:
        - dtm-builder
      environment:
        name: $CI_COMMIT_TAG
      script:
        - if [ "$ENV_CONFIG" ]; then cp $ENV_CONFIG $PWD/.env; fi
        - if [ "$PRIVATE_KEY" ]; then cp $PRIVATE_KEY $PWD/privateKey.pem; fi
        - if [ "$PUBLIC_KEY" ]; then cp $PUBLIC_KEY $PWD/publicKey.pem; fi
        - docker build -t $DOCKER_TAG .
        - docker tag $DOCKER_TAG registry.rassk.work/community/$DOCKER_TAG
        - docker push registry.rassk.work/community/$DOCKER_TAG
      only:
        - tags
    
    deploy:
      stage: deploy
      needs: [build, test]
      environment:
        name: $CI_COMMIT_TAG
      script:
        - 'SERVICE_RUNNING=$(curl --request GET http://172.17.101.125/running-service-id)'
        - echo $SERVICE_RUNNING
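        # /running-service-id returns the service ID (alfa or beta); this assumes
        # DEPLOY_WEBHOOK_1 belongs to alfa and DEPLOY_WEBHOOK_2 to beta
        - if [ "$SERVICE_RUNNING" = "alfa" ]; then curl --request POST $DEPLOY_WEBHOOK_2; fi
        - if [ "$SERVICE_RUNNING" = "beta" ]; then curl --request POST $DEPLOY_WEBHOOK_1; fi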
      only:
        - tags
    
    switch:
      stage: switch
      needs: [deploy]
      environment:
        name: $CI_COMMIT_TAG
      script:
        - sleep 10
        - curl --request GET http://172.17.101.125/switch
      only:
        - tags
    
    

    In the build job, the Docker image is built. Next comes the deploy job, which makes a request to /running-service-id to identify which service is running, and then deploys the image to the stopped service. Last is the switch job, which makes a request to the /switch route to swap the values in Redis.
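
The decision made in the deploy job (which webhook to call, given the running service) can be written as a small function. This is a sketch under assumptions: it supposes that DEPLOY_WEBHOOK_1 belongs to alfa and DEPLOY_WEBHOOK_2 to beta, and the names `WEBHOOK_VARIABLE` and `webhookVariableForDeploy` are hypothetical. Unlike two independent `if` checks, it fails loudly when the response is neither ID, instead of finishing without deploying anything.

```typescript
type ServiceId = 'alfa' | 'beta';

// Assumed mapping from service ID to the CI variable holding its webhook URL.
const WEBHOOK_VARIABLE: Record<ServiceId, string> = {
    alfa: 'DEPLOY_WEBHOOK_1',
    beta: 'DEPLOY_WEBHOOK_2',
};

function isServiceId(value: string): value is ServiceId {
    return value === 'alfa' || value === 'beta';
}

// Takes the raw body of /running-service-id and returns the name of the CI
// variable for the idle service's webhook. Throws on any unexpected body.
function webhookVariableForDeploy(runningResponse: string): string {
    const running = runningResponse.trim();
    if (!isServiceId(running)) {
        throw new Error(`Unexpected running-service-id response: "${running}"`);
    }
    const idle: ServiceId = running === 'alfa' ? 'beta' : 'alfa';
    return WEBHOOK_VARIABLE[idle];
}
```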

    This works well. The last thing I need to implement is some kind of secret for these two routes (JWT, for example).
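
One possible shape for that secret, simpler than JWT, is a static token shared between the CI job and the services and sent in a request header. The sketch below is framework-independent and only shows the check itself; the header name `x-deploy-token` and the functions `safeEqual` and `isAuthorized` are hypothetical, and the expected token is assumed to come from an environment variable of your choosing.

```typescript
// Compares two strings without returning on the first mismatch, so the
// time taken does not depend on how many leading characters match.
function safeEqual(a: string, b: string): boolean {
    let diff = a.length ^ b.length;
    const n = Math.max(a.length, b.length);
    for (let i = 0; i < n; i++) {
        // charCodeAt returns NaN past the end of a string; `| 0` turns that into 0.
        diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
    }
    return diff === 0;
}

// Hypothetical header carrying the shared token.
const DEPLOY_TOKEN_HEADER = 'x-deploy-token';

// Returns true only when a token is configured and the request carries
// the same value. Fails closed when either side is missing.
function isAuthorized(
    headers: Record<string, string | undefined>,
    expectedToken: string | undefined,
): boolean {
    if (!expectedToken) {
        return false;
    }
    const provided = headers[DEPLOY_TOKEN_HEADER];
    if (!provided) {
        return false;
    }
    return safeEqual(provided, expectedToken);
}
```

In a Node service, `crypto.timingSafeEqual` on equal-length buffers could replace `safeEqual`. The CI job would then send the token with curl's `--header` option, reading it from a masked CI variable.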