
Is it possible to switch a cadence-workflow cluster to advanced visibility (ElasticSearch) without downtime?


We have a cadence-workflow cluster with the visibility store on Cassandra. This does not support advanced visibility features, so we would like to move to ElasticSearch. However, the cluster is in active production use, so we would like to make the switch without downtime. To clarify "without downtime":

  • Running workflows can complete once we start the switchover.
  • New workflows can be started while the switchover is in progress.
  • The basic visibility features remain available both for workflows started before the switch and for those started during the switchover. This is important for debugging and troubleshooting, and because some of our code uses the getResult calls (Java client), which I believe rely on the visibility features on the server side.

Is this possible?


Solution

  • Yes, this is supported in Cadence. The steps below can be used as a reference.

    1. Configure advanced visibility

    • Follow the instructions to set up the advanced visibility dependency and configuration
    • Keep the basic visibility dependency
    • Before enabling reads from advanced visibility, set the AdvancedVisibilityWritingMode dynamic config to dual-write to both advanced visibility (ES based) and basic visibility (DB based), while still reading only from basic visibility:
    system.advancedVisibilityWritingMode:
      - value: "dual"
    system.enableReadVisibilityFromES:
      - value: false
    

    With this config, Cadence writes visibility records to both stores but continues to serve reads from basic visibility.

    • Deploy the config
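
As an illustration of what this first phase means for reads and writes, here is a minimal toy model in Python. It is not Cadence's implementation: the `VisibilityRouter` class, the in-memory dictionaries, and the workflow IDs are hypothetical, and the mode names "off" and "on" are assumed counterparts of the "dual" value shown above.

```python
from dataclasses import dataclass, field


@dataclass
class VisibilityRouter:
    """Toy model of the dual-write phase; not Cadence's implementation."""

    writing_mode: str = "dual"    # mirrors system.advancedVisibilityWritingMode
    read_from_es: bool = False    # mirrors system.enableReadVisibilityFromES
    basic: dict = field(default_factory=dict)     # stands in for the DB store
    advanced: dict = field(default_factory=dict)  # stands in for ElasticSearch

    def record(self, workflow_id: str, status: str) -> None:
        # "dual" writes to both stores; "off" and "on" (assumed mode names)
        # write only to the basic or only to the advanced store.
        if self.writing_mode in ("off", "dual"):
            self.basic[workflow_id] = status
        if self.writing_mode in ("on", "dual"):
            self.advanced[workflow_id] = status

    def list_workflows(self) -> dict:
        # Reads are served by exactly one store, chosen by the read flag.
        store = self.advanced if self.read_from_es else self.basic
        return dict(store)


router = VisibilityRouter()
router.basic["wf-old"] = "closed"   # recorded before dual writing began
router.record("wf-new", "open")     # recorded during the dual-write phase

reads_during_step_1 = router.list_workflows()
router.read_from_es = True
reads_if_switched_too_early = router.list_workflows()
```

In this toy model, reads during step 1 still see both the old and the new workflow, while flipping the read flag immediately would hide the workflow that was only ever written to the basic store. That gap is the reason reads stay on basic visibility at first.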

    2. Switching to advanced visibility

    Don't switch reads right away, because the new store doesn't have enough data yet. Depending on the domain retention and your use cases, switch reads to advanced visibility later, once you judge that it holds all the data you need. You can do this for selected domains first:

    system.advancedVisibilityWritingMode:
      - value: "dual"
    system.enableReadVisibilityFromES:
      - value: false
      - value: true
        constraints:
          domainName: "samples-domain-A"
      - value: true
        constraints:
          domainName: "samples-domain-B"
    

    This lets samples-domain-A and samples-domain-B read from advanced visibility, while the remaining domains continue to use basic visibility.
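
To make the per-domain override concrete, the following Python sketch models how a value list with constraints could be resolved for a given domain. It assumes that an entry whose `domainName` constraint matches takes precedence over the unconstrained entry. The `resolve` function and the `entries` structure are hypothetical and only mirror the shape of the YAML above. Cadence's real dynamic config client is not shown here and supports more than this.

```python
def resolve(entries, domain_name):
    """Return the value of the entry whose domainName constraint matches,
    falling back to the entry that has no constraints."""
    default = None
    for entry in entries:
        constraints = entry.get("constraints")
        if not constraints:
            # Unconstrained entry: remember it as the fallback value.
            default = entry["value"]
        elif constraints.get("domainName") == domain_name:
            # A domain-specific entry overrides the fallback.
            return entry["value"]
    return default


# Hypothetical in-memory equivalent of a system.enableReadVisibilityFromES list.
entries = [
    {"value": False},
    {"value": True, "constraints": {"domainName": "samples-domain-A"}},
]

reads_es_for_a = resolve(entries, "samples-domain-A")
reads_es_for_other = resolve(entries, "some-other-domain")
```

Under this model, a domain listed with a constraint reads from advanced visibility while any domain not listed falls back to the unconstrained `false` entry.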

    • When you are confident that every domain can use advanced visibility, enable reads from it for all domains:
    system.advancedVisibilityWritingMode:
      - value: "dual"
    system.enableReadVisibilityFromES:
      - value: true
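
Deciding when a domain is ready comes back to the retention point made earlier. As a rough worked example, the Python sketch below estimates the first date on which every closed-workflow record still inside the retention window was written during the dual-write phase. The function name, the start date, and the retention value are hypothetical. The estimate also ignores workflows that were already open before dual writing began, which may need separate checking.

```python
from datetime import date, timedelta


def earliest_full_coverage(dual_write_start: date, retention_days: int) -> date:
    """Date after which records older than the dual-write start have
    aged out of the retention window."""
    return dual_write_start + timedelta(days=retention_days)


dual_write_start = date(2024, 1, 10)  # hypothetical day dual writing was deployed
retention_days = 7                    # hypothetical domain retention
switch_date = earliest_full_coverage(dual_write_start, retention_days)
```

With these example values the estimate is 2024-01-17. Domains with longer retention would reach that point later, which is one reason to switch reads domain by domain.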
    

    3. Deprecate basic visibility

    Once the previous steps have succeeded, you can deprecate basic visibility: remove the DB-based visibility store from the config, and remove the dynamic config entries used above. When only advanced visibility is configured, Cadence writes to and reads from advanced visibility by default.