gitlab, gitlab-ci, gitlab-pages

How does Gitlab's "pages" job work internally?


I have a Gitlab project like this (.gitlab-ci.yml):

# Sub-jobs listed as comments
stages:
  - check-in-tests
      # shellcheck
      # pylint
      # unit-tests
  - several-other-things
      # foo
      # bar
      # baz
  - release
      # release

# Run some shell code static tests and generate logs/badges
shellcheck:
  stage: check-in-tests
  script:
    - bash run_shellcheck.sh
  artifacts:
    paths:
      - logs/shellcheck.log
      - logs/shellcheck.svg

# Run some python code static tests and generate logs/badges
pylint:
  stage: check-in-tests
  script:
    - bash run_pylint.sh
  artifacts:
    paths:
      - logs/pylint.log
      - logs/pylint.svg

# <snip>

On my project page I'd like to render the .svg files produced during check-in-tests as badges.

The Gitlab badges tool requires a URL to an image file. It is incapable of loading images from URLs with query strings. Unfortunately, the syntax for accessing specific job artifacts ends in a query string. This effectively means that we can't link to job artifacts as badges.

The most popular workaround is to abuse Gitlab's pages feature to store artifacts as static content. From there we can get clean URLs to our artifacts that don't contain query strings.

My confusion involves the underlying mechanism behind the "pages" job defined in .gitlab-ci.yml. The official documentation here is very sparse. There are a million examples for deploying an actual webpage with various frameworks, but I'm not interested in any of them since I'm just using my project's "page" for file hosting.

The assumption seems to be that I want to deploy my page at the end of the pipeline. However, I want to upload the shellcheck and pylint artifacts near the beginning of the pipeline. Furthermore, I want those artifacts to be uploaded even if the pipeline stages fail.

Syntactically the pages job looks identical to any other job. There's nothing there to describe how it's magically picked up by Gitlab's internals. This leaves me with the following questions:

  • Can I change the stage from "deploy" to "check-in-tests", or is the deploy stage specifically part of the hidden magic that Gitlab looks for when parsing for a pages job?
  • If I'm tied to the deploy stage, can I re-arrange the stages to make it come earlier in the pipeline without breaking the magic?
  • Does the pages job deploy artifacts from the local machine (default behavior for a job), or are the listed paths coming from artifacts which have already been uploaded to the Gitlab pipeline by earlier jobs?
  • If the pages job is only looking for artifacts locally, how can I ensure that it runs on the same machine as the earlier jobs so that it finds the artifacts which they produced? Let's assume that the Gitlab executors all come from a pool with the same tag and aren't tagged individually.
  • Is there any chance of getting the pages job to run within the same Docker container that originally produced the artifacts?

Solution

  • The magic around GitLab Pages is in the name of the job: it has to be named "pages", and nothing else. The job can be moved to a different stage. As soon as the "pages" job has finished successfully, GitLab starts a special job called "pages:deploy". That job is shown in the deploy stage even if you change the stage that the "pages" job runs in.

    If you put the "pages" job in an early stage, jobs in later stages can fail and the "pages:deploy" job will still run and update GitLab Pages.

    Other than that, the "pages" job is just like a normal job in GitLab. If you need artifacts from other jobs, you can fetch them by declaring artifacts on those jobs and listing the jobs under dependencies in the "pages" job:

    https://docs.gitlab.com/ee/ci/yaml/#dependencies

    The "pages" job should create a folder named "public" and declare that folder as an artifact; the contents of that folder are what gets published.
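
    Putting these points together for the pipeline in the question, a minimal sketch might look like the following. It assumes an extra stage, here called publish-badges (a hypothetical name), placed right after check-in-tests, because dependencies fetches artifacts from jobs in earlier stages. The when: always settings are an assumption about the desired behaviour: they are meant to keep the badges updating even when a check fails. The cp line and the flat layout under public/ are illustrative only.

    ```yaml
    stages:
      - check-in-tests
      - publish-badges        # hypothetical stage, added for this sketch
      - several-other-things
      - release

    shellcheck:
      stage: check-in-tests
      script:
        - bash run_shellcheck.sh
      artifacts:
        when: always          # upload the log and badge even if the script fails
        paths:
          - logs/shellcheck.log
          - logs/shellcheck.svg

    pylint:
      stage: check-in-tests
      script:
        - bash run_pylint.sh
      artifacts:
        when: always          # same assumption as for shellcheck
        paths:
          - logs/pylint.log
          - logs/pylint.svg

    # The job name "pages" is what GitLab looks for; the stage is free to choose.
    pages:
      stage: publish-badges
      when: always            # run even if a check-in-tests job failed
      dependencies:           # download the artifacts uploaded by these jobs
        - shellcheck
        - pylint
      script:
        - mkdir -p public
        - cp logs/*.svg logs/*.log public/
      artifacts:
        paths:
          - public            # the folder that "pages:deploy" publishes
    ```

    With this layout the "pages" job does not need to run on the same machine or in the same container as the test jobs, since the files travel as uploaded artifacts. On gitlab.com the badge would then typically be reachable at a URL of the form https://<namespace>.gitlab.io/<project>/shellcheck.svg, which has no query string; self-hosted instances may use a different Pages domain.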