Tags: git, gitlab, gitlab-ci, access-token, git-submodules

git submodule update with Access Token inconsistent behavior


Context

On a self-managed GitLab instance, with multiple users and groups, I'm trying to do a git submodule update --init --recursive in the .gitlab-ci.yml of main_project. This repo contains a submodule (filter_lib), itself containing a submodule (helper_funcs):

main_project
├── app
│   └── filter_lib                    <- submodule
│       ├── .gitmodules
│       ├── lib
│       └── helper_funcs              <- submodule
│           └── funcs
├── .gitmodules
├── .gitlab-ci.yml
├── .gi
└── tests
    └── test_stuff.py

main_project is in one GitLab group (let's call it group1), and both submodules (filter_lib and helper_funcs) are in another GitLab group and subgroup (group2/subgroupA). The two groups have no access rights to each other:

my_gitlab_instance
├── group1
│   └── main_project
└── group2
    └── subgroupA
        ├── filter_lib
        └── helper_funcs

Problem

I want to init all submodules.
First, I tried with this code at the beginning of my .gitlab-ci.yml:

variables:
  GIT_SUBMODULE_STRATEGY: recursive

The CI job failed with the following error before it even ran my scripts:

Updating/initializing submodules recursively with git depth set to 50...
Submodule 'app/filter_lib' (https://gitlab-ci-token:[MASKED]@my_gitlab_instance.com/group2/subgroupA/filter_lib.git) registered for path 'app/filter_lib'
Cloning into '/builds/group1/main_project/app/filter_lib'...
Submodule path 'app/filter_lib': checked out '28d6c0f2d0bc691c29a406f44ae9b69b4e00f2b2'
Submodule 'helper_funcs' (git@gitlab:group2/subgroupA/helper_funcs) registered for path 'app/filter_lib/helper_funcs'
Cloning into '/builds/group1/main_project/app/filter_lib/helper_funcs'...
error: cannot run ssh: No such file or directory
fatal: unable to fork
fatal: clone of 'git@gitlab:group2/subgroupA/helper_funcs' into submodule path '/builds/group1/main_project/app/filter_lib/helper_funcs' failed
Failed to clone 'helper_funcs'. Retry scheduled
Cloning into '/builds/group1/main_project/app/filter_lib/helper_funcs'...
error: cannot run ssh: No such file or directory
fatal: unable to fork
fatal: clone of 'git@gitlab:group2/subgroupA/helper_funcs' into submodule path '/builds/group1/main_project/app/filter_lib/helper_funcs' failed
Failed to clone 'helper_funcs' a second time, aborting
Failed to recurse into submodule path 'app/filter_lib'

It is kind of expected because group1/main_project doesn’t have read rights to any repo in group2.

So I tried another way, by changing the GIT_SUBMODULE_STRATEGY to normal and allowing group1/main_project to access group2/subgroupA/filter_lib and group2/subgroupA/helper_funcs the following way:

For filter_lib, I went into the repo Settings > Access Tokens and generated a token with all available scopes and the Maintainer role. I then added this token in main_project > Settings > CI/CD > Variables as a masked variable named FILTER_LIB_CLONE_KEY. I did the same for helper_funcs, with the variable named HELPER_FUNCS_CLONE_KEY.

Please note all the following commands were executed through the .gitlab-ci.yml of main_project.

I then used sed to rewrite the .gitmodules of main_project before attempting git submodule update, so that it looked like this during the CI stage:

$ cat .gitmodules
[submodule "app/filter_lib"]
    path = app/filter_lib
    url = https://gitlab-ci-token:[MASKED(FILTER_LIB_CLONE_KEY)]@my_gitlab_instance.com/group2/subgroupA/filter_lib.git

Running git submodule update --init in main_project successfully cloned the content of group2/subgroupA/filter_lib:

$ cd app/filter_lib
$ ls -al
total 23
drwxrwxrwx    4 root     root          4096 May 17 10:51 .
drwxrwxrwx    3 root     root          4096 May 17 09:24 ..
-rw-rw-rw-    1 root     root            40 May 17 09:24 .git
-rw-rw-rw-    1 root     root           137 May 17 10:51 .gitmodules
drwxrwxrwx    2 root     root          4096 May 17 10:52 helper_funcs
drwxrwxrwx    6 root     root          4096 May 17 09:24 lib

I did the same for app/filter_lib/.gitmodules, which looked like this during the CI after the sed:

$ cat app/filter_lib/.gitmodules
[submodule "helper_funcs"]
    path = helper_funcs
    url = https://gitlab-ci-token:[MASKED(HELPER_FUNCS_CLONE_KEY)]@my_gitlab_instance.com/group2/subgroupA/helper_funcs.git
    ignore = dirty
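
The sed command itself is not shown in the question. A minimal sketch of such a rewrite follows. It assumes the committed URL is the SSH form that appears in the job log (git@gitlab:group2/subgroupA/helper_funcs), works on a throwaway copy of the file, and uses a dummy token when HELPER_FUNCS_CLONE_KEY is not set. It also assumes the token contains no `#` or `&`, which would need escaping in the sed expression.

```shell
#!/bin/sh
# Sketch: rewrite an SSH submodule URL in a .gitmodules file to HTTPS with a token.
set -eu

# Placeholder so the sketch runs outside CI; in a job the masked variable is already set.
HELPER_FUNCS_CLONE_KEY="${HELPER_FUNCS_CLONE_KEY:-dummy-token}"

workdir="$(mktemp -d)"
gitmodules="$workdir/.gitmodules"

# Stand-in for app/filter_lib/.gitmodules before the rewrite
# (the SSH URL is assumed from the job log, not from the repository itself).
cat > "$gitmodules" <<'EOF'
[submodule "helper_funcs"]
    path = helper_funcs
    url = git@gitlab:group2/subgroupA/helper_funcs
    ignore = dirty
EOF

# Swap the SSH prefix for an HTTPS one carrying the token, and append ".git".
# Writing to a temporary file avoids relying on a particular sed's -i flag.
sed "s#git@gitlab:\(.*\)\$#https://gitlab-ci-token:${HELPER_FUNCS_CLONE_KEY}@my_gitlab_instance.com/\1.git#" \
  "$gitmodules" > "$gitmodules.new"
mv "$gitmodules.new" "$gitmodules"

cat "$gitmodules"
```

In a real job the same sed expression would target app/filter_lib/.gitmodules directly instead of a temporary copy.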

In filter_lib, I then did:

$ git submodule update
Host key verification failed.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@gitlab:group2/subgroupA/helper_funcs' into submodule path '/builds/group1/main_project/app/filter_lib/helper_funcs' failed
Failed to clone 'helper_funcs'. Retry scheduled
Host key verification failed.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@gitlab:group2/subgroupA/helper_funcs' into submodule path '/builds/group1/main_project/app/filter_lib/helper_funcs' failed
Failed to clone 'helper_funcs' a second time, aborting

However, manually cloning the helper_funcs repo at the right place with HELPER_FUNCS_CLONE_KEY works.

Why is it possible to git clone but not to git submodule update with the same repo URL?

Why does git submodule update work on the first submodule but not on the second, even though the access rights are the same?


Solution

  • As @torek points out in the comments, git is still authenticating over ssh rather than over https, which is what you want.

    I had the same problem. In my case, I was installing openssh inside my job. Leaving openssh out of the install step was what made git start using https.

    More precisely, my install command went from:

        - apk add gcc linux-headers musl-dev git openssh
    

    to

        - apk add gcc linux-headers musl-dev git
    

    I am still not sure why git prefers ssh over https when openssh is installed.
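
One thing that may be worth checking in this situation (it is not confirmed above as the cause) is which URL git has actually recorded for the submodule. git submodule init copies the URL from .gitmodules into .git/config, and git submodule update reads it from there, so editing .gitmodules afterwards has no effect until git submodule sync is run. The sketch below demonstrates this in a throwaway repository that stands in for app/filter_lib. The commit ID is only a placeholder for the gitlink entry, the HTTPS URL carries no token, and nothing is cloned.

```shell
#!/bin/sh
# Sketch: show that a rewritten .gitmodules URL only takes effect after "git submodule sync".
set -eu

# Throwaway superproject standing in for app/filter_lib.
repo="$(mktemp -d)"
cd "$repo"
git init -q .

cat > .gitmodules <<'EOF'
[submodule "helper_funcs"]
    path = helper_funcs
    url = git@gitlab:group2/subgroupA/helper_funcs
EOF

# Record a gitlink entry without cloning anything (placeholder commit ID).
git update-index --add --cacheinfo 160000,28d6c0f2d0bc691c29a406f44ae9b69b4e00f2b2,helper_funcs
git add .gitmodules
git -c user.name=ci -c user.email=ci@example.invalid commit -q -m "add submodule"

# init copies the URL from .gitmodules into .git/config.
git submodule --quiet init

# Rewrite .gitmodules after the submodule has already been registered.
sed 's#git@gitlab:\(.*\)$#https://my_gitlab_instance.com/\1.git#' .gitmodules > .gitmodules.new
mv .gitmodules.new .gitmodules

# .git/config still holds the old SSH URL at this point.
url_before_sync="$(git config --get submodule.helper_funcs.url)"

# sync copies the current .gitmodules URL into .git/config.
git submodule --quiet sync
url_after_sync="$(git config --get submodule.helper_funcs.url)"

echo "before sync: $url_before_sync"
echo "after sync:  $url_after_sync"
```

If the URL printed before the sync is still the git@gitlab one, running git submodule sync (with --recursive for nested submodules) ahead of git submodule update in the CI script would be the thing to try.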