I have several GitLab repos where the general setup involves having a `master` branch, a `stage` (pre-release) branch and a `dev` branch.
Push permissions are disabled for all 3 branches.
The workflow is to fork from the `dev` branch for any hot fixes, bug fixes and features. When you are satisfied with your changes, you submit a merge request to `dev`. Eventually, when a stable build is ready in `dev`, a merge request is submitted for the `stage` branch. Lastly, when you are satisfied with the pre-release, you submit a merge request for the `master` branch.
I have CI/CD configured so that tests, builds and deployments are automatically executed from the `master` and `stage` branches, with automatic generation of `CHANGELOG.md` files. The `stage` branch deploys to the UAT S3 bucket and `master` deploys to the production S3 bucket.
Deployment is handled through `semantic-release` (which follows Semantic Versioning 2.0.0) and is responsible for bumping versions, generating changelogs and deploying.
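For context, the pipeline looks roughly like the sketch below. Job names, stage names and the `npx semantic-release` invocation are illustrative assumptions, not my exact configuration:

```yaml
# Sketch only: run the release tooling from stage and master.
stages:
  - test
  - release

test:
  stage: test
  script:
    - npm ci
    - npm test

release:
  stage: release
  script:
    - npm ci
    - npx semantic-release   # bumps the version, writes CHANGELOG.md, deploys
  only:
    - master
    - stage
```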
I have a similar setup to the one just described, except it is a monorepo, so I am using Lerna to handle the publishing (deploying) with `{"conventionalCommits": true}` to replicate `semantic-release`'s behaviour. I am using independent versioning inside the monorepo.
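For reference, the relevant `lerna.json` looks roughly like this sketch (the package glob and npm client are assumptions):

```json
{
  "version": "independent",
  "npmClient": "npm",
  "packages": ["packages/*"],
  "command": {
    "publish": {
      "conventionalCommits": true
    }
  }
}
```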
Both the `semantic-release` and the Lerna setups force the `master` branch to always be either behind or equal to the `stage` and `dev` branches, and the `stage` branch to always be behind or equal to the `dev` branch, in a kind of cascading effect:

`dev` >= `stage` >= `master`
Both `lerna publish` and `semantic-release` make several changes to the files when publishing/deploying. These changes include updating the `CHANGELOG.md` file and bumping the version inside the `package.json` file. Both tools eventually push these changes back to the branch they are run from through the CI/CD.
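In the `semantic-release` case, the push-back comes from its Git plugin. A sketch of the release configuration, assuming the standard `@semantic-release/changelog` and `@semantic-release/git` plugins (the exact plugin list here is an assumption):

```json
{
  "branches": ["master", "stage"],
  "plugins": [
    "@semantic-release/commit-analyzer",
    "@semantic-release/release-notes-generator",
    "@semantic-release/changelog",
    "@semantic-release/npm",
    ["@semantic-release/git", {
      "assets": ["CHANGELOG.md", "package.json"]
    }]
  ]
}
```

It is that last `@semantic-release/git` step that commits the bumped `package.json` and `CHANGELOG.md` back to the branch the release ran from.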
What this means is that if I merge from `dev` to `stage`, `stage` would then have the bumped version numbers and new changelogs pushed into it by the `semantic-release` or `lerna publish` runs. This causes the `stage` branch to be ahead of the `dev` branch, and all future forks from the `dev` branch to diverge from the `stage` branch. The next time I merge from `dev` to `stage` it will not be the simple fast-forward merge it is meant to be, and the merge will most likely encounter conflicts, which would block future merges or fail the CI/CD.
For `semantic-release` I found a workaround: the pipeline no longer commits the release changes back; instead it converts the `CHANGELOG.md` file to a PDF and sends it to my team's email. This works out well because `semantic-release` uses tags to determine the changed files and to decide how to bump the version. So although the version inside the repo remains constant at `1.0.0`, for example, `semantic-release` is smart enough to increment the version from the latest tag, not from what's in `package.json`.
This unfortunately doesn't hold true for Lerna, which still uses tags to determine the changed files but then bumps from the version inside `package.json`. Because the updated `package.json` with the new version is never pushed, Lerna always bumps me from `1.0.0` to either `1.0.1`, `1.1.0`, or `2.0.0`. So I am stumped with Lerna.
How should I set up my CI/CD to avoid this problem? I know the structure of my repo is common, and I haven't found anyone addressing this issue despite the countless users of Lerna and `semantic-release`, which tells me I have obviously missed something, as it is not a widespread issue.
As I was writing this question, it crossed my mind that maybe I should only bump versions in `dev` and then deploy from `stage` and `master`. This would prevent `stage` and `master` from ever being ahead of `dev`. Would this be the right way to do it?
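If I went that route, the pipeline split might look roughly like this sketch (job names, stage names and the Lerna flags are assumptions about how it could be wired up):

```yaml
# Sketch: version bumps happen only on dev; stage and master only deploy.
bump-version:
  stage: release
  script:
    - npx lerna version --conventional-commits --yes   # commits bump + changelog back to dev
  only:
    - dev

deploy-uat:
  stage: deploy
  script:
    - npx lerna publish from-package --yes   # publish the versions dev already bumped
  only:
    - stage

deploy-production:
  stage: deploy
  script:
    - npx lerna publish from-package --yes
  only:
    - master
```

The key idea is `lerna publish from-package`, which publishes whatever versions are already in each `package.json` without computing a new bump, so `stage` and `master` never gain commits that `dev` lacks.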
Maintaining package version information in the repo does not scale. Still, all the tools out there keep trying to make it work. I have nothing to offer in the way of an alternative (yet), other than to say that release data should be managed by means other than storing it in the source repo. What we're really talking about here are asynchronous processes: dev, build and release. The distributed nature of these systems implies that we cannot treat repos as file shares and expect them to scale well.
See my other rant on this topic; I haven't had time to do a good write-up on it yet. I would add that Git tags are meant to be handy milestone markers for devs to find the right Git hash to go back to when creating a fix branch. Commit messages are meant for changelogs and are just one of the inputs into deciding what version to release from which build.
No dev working in a multi-dev, multi-branch, distributed environment can possibly predict the appropriate semantic version to apply at some random point in the future. At least not without having full dictatorial control of every dev, branch and build/release environment, and even then they would be hard pressed to make it work. It's that control that current tooling implies, but in practice it never scales.
Consider that your package feed service likely has all, or a sufficient portion, of your release history. As long as you have only one feed service, you can use it to determine the version floor for your next release: process your semantic commits, look up the most recent version matching the tag your process is targeting (daily, beta, RC, none, or whatever), calculate the next appropriate version, update the appropriate version fields in source code, then build and test. If the feed service doesn't let you include hidden or deleted versions in your query, you'll have to use your own database. Do not check in the modified files! Those fields should be zeroed in your repo, effectively marking local dev builds as `0.0.0-dev` or something along those lines.
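The version-floor calculation can be sketched in plain shell. Everything here is illustrative: it assumes the floor was already fetched from the feed (e.g. with `npm view <pkg> version`) and that the bump level was already derived from the conventional commits.

```shell
#!/bin/sh
# Compute the next version from a feed-supplied floor and a bump level,
# instead of trusting any version field checked into the repo.
next_version() {
  floor="$1"; level="$2"
  major="${floor%%.*}"          # 2.3.7 -> 2
  rest="${floor#*.}"            # 2.3.7 -> 3.7
  minor="${rest%%.*}"           # 3.7   -> 3
  patch="${rest#*.}"            # 3.7   -> 7
  case "$level" in
    major) echo "$((major + 1)).0.0" ;;
    minor) echo "$major.$((minor + 1)).0" ;;
    patch) echo "$major.$minor.$((patch + 1))" ;;
  esac
}

# Example: the feed says 2.3.7 is the floor, and the commits
# since then contain a feature, so this is a minor bump.
next_version 2.3.7 minor   # prints 2.4.0
```

The result feeds the build, and only a successful build/test run earns a tag back on the Git hash it was built from.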
Automatic publishing of prerelease versions is okay, but there should be a human in the loop for release versions. If, and only if, all the above steps succeed, apply a tag to the Git hash you just successfully built from.
My dream CI/CD system runs every commit to my release branch(es) through a test build and unit-test run, detects whether existing test cases were modified (auto-detecting breaking changes!), analyzes the commit messages for indications of intentional breakage, and presents all of that information to the release build system on demand. My release build system produces a `-alpha.build.###` and runs all of the acceptance tests against it.
If there is no known breakage and the intended target is a prerelease, it then updates the files containing version information and runs an incremental build with one final smoke test prior to automatic publication. Here's where there are some gray areas: some prerelease targets should not include breakage without human intervention, and for others it's okay. So I would have a special set of prerelease targets that do not allow automated publication of breakage, such as certain levels of dog-fooders, or bits targeted at my internal long-running test infrastructure.
If it's an untagged release target, then I would prefer it to build and package everything for consumption in my final stage of testing. This is where the automation verifies the package for conformance to policies, ensures that it can be unpacked correctly, and gathers sign-offs from area owners, department/division heads, etc., prior to publication. It might include some randomized test deployments in cases where we're targeting a live system.
And all of the above is just an over-simplified description really. It glosses over more than it clarifies, because there's just too much variation in the real-world needs of producers and consumers.
Circling back to tools and control: there are many in the DevOps world who will tell you that the main point is to standardize around tooling, which reminds me of this xkcd comic.