Suppose that I have a primary repo y
with some submodule, say at sub/x
.
Suppose also that, for both the primary and the submodule repos, master
is the active branch, and that the .gitmodules
file of the primary repo specifies branch = master
.
Now, suppose that, in addition to its master
branch, the primary (y
) repo has a branch yA
, and likewise, in addition to its master
branch, the submodule repo (x
) has a branch xA
.
I would like the yA
branch of the y
repo to "see"/use the xA
branch of the x
repo.
This would mean that switching between the master
and yA
branches in the primary repo would cause the corresponding switch between the master
and xA
branches in the submodule.
Does git have any support for this?
I tried the following:
- yA branch on the primary repo;
- xA branch on the submodule repo;
- master with xA as the value for the branch parameter in the primary repo's .gitmodules file;
This did not work as I had hoped: if I switch to the master branch on the primary repo, this has no effect on the active branch setting of the submodule repo (and therefore, the master branch does not have a clean status).
Does git have any support for [what I want]?
Sort of, I think. You will need to make sure that .git/config
(in the superproject) does not acquire a submodule.<name>.branch setting.
Use different settings in .gitmodules
(in the superproject commits), and use git submodule update --remote
as needed. I have not tested this, but see the long description.
My overall general advice: the branch
setting of a submodule is mostly useless and irrelevant. Just ignore it. We'll get to the mostly part in a bit, though, and you can see if you can use it.
A submodule is a Git repository that some other Git will, on occasion, enter in order to run some Git command. That other Git is called the superproject.
The superproject Git's main operation is:
(cd $path && git checkout $hash)
Nowhere in this sequence does any branch name occur. That's why the branch
setting is irrelevant.
The $path
and $hash
parts come from the superproject Git's index, and they got there by being extracted from a commit in the superproject. That commit recorded the path of the submodule, and a raw hash ID. No branch name occurs here either.
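To make the (path, hash) pair concrete, here is a minimal sketch that builds two throwaway repositories and prints what the superproject commit actually records. The directory names x and y, the demo identity, and the protocol.file.allow override (which recent Git versions need for local-path submodule clones) are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: a superproject commit stores only a path and a raw hash for a submodule.
set -eu
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway submodule source "x" with one commit on master.
git init -q x
git -C x symbolic-ref HEAD refs/heads/master
git -C x commit -q --allow-empty -m "x: initial"

# Throwaway superproject "y" with x added at sub/x.
git init -q y
git -C y symbolic-ref HEAD refs/heads/master
git -C y -c protocol.file.allow=always submodule --quiet add "$work/x" sub/x
git -C y commit -q -m "y: add sub/x"

# The tree entry (a "gitlink") has mode 160000, type commit, and a hash ID.
entry=$(git -C y ls-tree HEAD sub/x)
echo "$entry"
recorded=$(echo "$entry" | awk '{print $3}')
```

The printed entry contains the mode, the word commit, the hash, and the path. There is no field in which a branch name could be stored.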
When you run git checkout
or git switch
in the superproject, to select some branch name and therefore some particular commit, the superproject Git extracts that commit to its (the superproject's) index and to your work-tree for that superproject. This puts the correct ($path, $hash) pair into the superproject's index.
Unfortunately, it does not invoke the (cd $path && git checkout $hash) part by default, to update the submodule. To make it do so, you have several options:
- Run git submodule update. This command does exactly that (well, by default anyway: see details below).
- Use git checkout --recurse-submodules (or the same flag for git switch). This makes git checkout run the update, and propagates into the submodule Git, so that when that submodule runs git checkout (or git switch), if the submodule is a superproject for another submodule, that submodule will, in its superproject role, invoke the update. This will repeat (recursively) for all nested submodules. (I generally don't use this but I have not had to deal with recursive submodules much. It's quite powerful, because of the recursion.)
- Set submodule.recurse to true. This enables the --recurse-submodules option on multiple commands, including checkout/switch, but also on git fetch and git pull. (I dislike this one: I think it's too powerful. However, you can set it, and then explicitly disable recursive push with the push.recurseSubmodules setting.)

When does the branch setting matter?

The git submodule documentation has several long and fairly impenetrable paragraphs to describe the git submodule update sub-command. (I believe this indicates a flaw in the overall setup of submodules, but we must work with what we have, at least until we can come up with something better.) Let me quote from it here:
update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--] [<path>...]
Update the registered submodules to match what the superproject expects by cloning missing submodules, fetching missing commits in submodules and updating the working tree of the submodules. The "updating" can be done in several ways depending on command line options and the value of submodule.<name>.update configuration variable. ...
As you can see, there are many options. To keep this answer from getting even longer, let's concentrate on just three of them: --checkout
, --rebase
, and --merge
. There are two more that aren't options but that you can set with the submodule.name.update
variable, which we'll ignore here. These options—--checkout
, --rebase
, and --merge
—set which kind of action the update will use, which is the same as the option name without the leading double hyphen.
The checkout
mode is the default default. That is, if you have not set an explicit submodule.name.update
setting, and you don't specify --rebase
or --merge
, you get checkout
. That's what everyone uses, mostly, and it is why the word mostly appears in the overall general advice at the top of this answer.
Now, on to the three modes. I'll quote from the documentation again, with some minor formatting changes and commentary afterward:
checkout
the commit recorded in the superproject will be checked out in the submodule on a detached HEAD.
rebase
the current branch of the submodule will be rebased onto the commit recorded in the superproject.
merge
the commit recorded in the superproject will be merged into the current branch in the submodule.
So, with the default mode, no branch name enters the picture anywhere. Only the rebase
and merge
modes actually make use of a branch name. So now we get to ask the question: which branch name?
The documentation makes it clear: the current branch in the submodule. That's not the branch =
setting of the submodule; it's the current branch in the submodule.
But what branch is current, in the submodule? You can find out, if you like:
(cd $path && git rev-parse --abbrev-ref HEAD)
will tell you, for each path you pass in, what branch if any is current. It prints HEAD
if the submodule is using detached-HEAD mode, as it will be if you've run git submodule update --checkout
, or any git submodule update
that uses checkout
mode.
If you were to predict the current branch, or whether the submodule is on a detached HEAD and therefore not on any branch at all, what would you predict? Well, have you run git submodule update
? You had to do a git submodule update --init
initially, unless you did a recursive mode checkout, in which case Git did a git submodule update --init --checkout
for you. So chances are that your submodule is in detached-HEAD mode, and therefore has no current branch.
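You can check this prediction on a scratch setup. The sketch below assumes throwaway repositories named x, y, and y2, a demo identity, and the protocol.file.allow override that recent Git versions need for local-path submodule clones. It clones a superproject, runs the initial update, and asks the submodule what it is on.

```shell
#!/bin/sh
# Sketch: after the initial update, the submodule sits on a detached HEAD.
set -eu
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Submodule source and a superproject that records it at sub/x.
git init -q x
git -C x symbolic-ref HEAD refs/heads/master
git -C x commit -q --allow-empty -m "x: initial"
git init -q y
git -C y symbolic-ref HEAD refs/heads/master
git -C y -c protocol.file.allow=always submodule --quiet add "$work/x" sub/x
git -C y commit -q -m "y: add sub/x"

# A fresh clone of the superproject, followed by the initial update.
git clone -q "$work/y" y2
git -C y2 -c protocol.file.allow=always submodule --quiet update --init

# Prints the branch name, or the literal word HEAD when detached.
state=$(git -C y2/sub/x rev-parse --abbrev-ref HEAD)
echo "sub/x is on: $state"
```

In this setup the answer is the literal word HEAD, even though the submodule source only ever had a master branch.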
We're still a bit at sea, in other words. How do we get the submodule Git to be on a branch in the first place?
There's one simple and obvious method: we can do our own (cd $path; git checkout $branch)
where we provide the $path
and $branch
ourselves. That way, the submodule is on the branch we want, whatever commit that is. But since we're providing $branch
, we don't need a setting. We just do:
(cd path/to/submodule; git checkout feature/foo)
directly. So that's not it either.
If we scroll down to the OPTIONS section in the documentation, and then scroll further down to the --remote
option, we finally find the one place where the setting is actually used:
--remote
This option is only valid for the update command. Instead of using the superproject’s recorded SHA-1 to update the submodule, use the status of the submodule’s remote-tracking branch. The remote used is branch’s remote (branch.<name>.remote), defaulting to origin. The remote branch used defaults to the remote HEAD, but the branch name may be overridden by setting the submodule.<name>.branch option in either .gitmodules or .git/config (with .git/config taking precedence).
This works for any of the supported update procedures (--checkout, --rebase, etc.). The only change is the source of the target SHA-1. For example, submodule update --remote --merge will merge upstream submodule changes into the submodules, while submodule update --merge will merge superproject gitlink changes into the submodules.
Seriously, this text is really hard to read—but what it says is that git submodule update --remote
won't just use the raw SHA-1 hash ID from the superproject. Instead, it will use a raw SHA-1 hash ID it gets from somewhere else. Where, precisely, is the somewhere else?
In order to ensure a current tracking branch state, update --remote fetches the submodule’s remote repository before calculating the SHA-1. If you don’t want to fetch, you should use submodule update --remote --no-fetch.
So: when you use --remote with your git submodule update command, the superproject will:
1. run (cd $path; git fetch), unless you add --no-fetch;
2. run (cd $path; git rev-parse $(complicated)) to get a hash ID.
The $(complicated) part is complicated, but it grabs the branch name from the branch = setting, e.g., branch = master, from either .gitmodules or .git/config. It turns this into the remote-tracking name, such as origin/master, that step 1 will have just updated. See also VonC's answer to How can I specify a branch/tag when adding a Git submodule?.
The special name .
means use the branch name in the superproject—but:
I would like the yA branch of the y repo to "see"/use the xA branch of the x repo.
Unless the spellings match exactly, you can't get this with the .
trick. And, if the submodule's branch name has been copied into the superproject's .git/config
, it will stay set to whatever it is set to, but if not, the superproject Git will read the branch =
setting from the .gitmodules
file.
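The lookup order can be approximated with plain git config calls. This is only a sketch of the precedence described in the documentation, not Git's actual implementation: effective_branch is a hypothetical helper, and the submodule name sub/x is assumed to be the same as its path.

```shell
#!/bin/sh
# Sketch: .git/config wins over .gitmodules for submodule.<name>.branch.
set -eu
work=$(mktemp -d) && cd "$work"

# Hypothetical helper: print the branch name that --remote would start from.
# $1 is the superproject directory, $2 is the submodule name.
effective_branch() {
    git -C "$1" config "submodule.$2.branch" 2>/dev/null ||
    git -C "$1" config -f .gitmodules "submodule.$2.branch" 2>/dev/null ||
    echo HEAD   # no setting anywhere: the remote's HEAD is used
}

# Only .gitmodules has a value, so that value is used.
git init -q y
git -C y config -f .gitmodules submodule.sub/x.branch xA
from_gitmodules=$(effective_branch y sub/x)

# A value in .git/config shadows whatever the checked-out .gitmodules says.
git -C y config submodule.sub/x.branch master
from_config=$(effective_branch y sub/x)

echo "$from_gitmodules $from_config"
```

The second lookup returns master regardless of what .gitmodules contains, which is why a stray entry in the superproject's .git/config defeats any per-commit setting.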
If the .gitmodules
file committed in the primary repository commit $SHA_YA
as recorded in branch name yA
says branch = xA
, then, at the time you run git submodule update --remote
(with or without --no-fetch
), the superproject Git should do a git rev-parse
on origin/xA
, assuming submodule x
has origin
as its remote here. That will become the source of the raw hash ID that superproject y
will pass to submodule x
when superproject y
runs (cd x; git checkout $hash)
.
When you switch to some other commit—note that the branch name is not relevant here; what matters is the commit hash ID, and the .gitmodules
file that is part of that commit—in the superproject, the .gitmodules
file in the superproject can have some other branch =
setting. Your git submodule update --remote
command will find that setting, and have the submodule Git do a different git rev-parse
to get the hash ID to pass to the submodule Git when the superproject tells the submodule what to check out.
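Putting the pieces together, the following is a sketch of the whole arrangement on throwaway repositories. The names x, y, and sub/x follow the question. The demo identity, the sub wrapper function, and the protocol.file.allow override (needed by recent Git versions for local-path submodules) are assumptions for illustration. Treat it as an experiment to run, not as a guarantee of behavior across Git versions.

```shell
#!/bin/sh
# Sketch: per-commit .gitmodules branch settings plus "update --remote".
set -eu
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Hypothetical wrapper: run a submodule subcommand in superproject y.
sub() { git -C y -c protocol.file.allow=always submodule --quiet "$@"; }

# Submodule source x: branches master and xA with different tip commits.
git init -q x
git -C x symbolic-ref HEAD refs/heads/master
git -C x commit -q --allow-empty -m "x: master"
git -C x checkout -q -b xA
git -C x commit -q --allow-empty -m "x: xA"
git -C x checkout -q master

# Superproject y: the .gitmodules committed on master says branch = master.
git init -q y
git -C y symbolic-ref HEAD refs/heads/master
sub add -b master "$work/x" sub/x
git -C y commit -q -m "y: sub/x follows master"

# Branch yA: the committed .gitmodules says branch = xA instead.
git -C y checkout -q -b yA
git -C y config -f .gitmodules submodule.sub/x.branch xA
sub update --remote sub/x
git -C y add .gitmodules sub/x
git -C y commit -q -m "yA: sub/x follows xA"
on_yA=$(git -C y/sub/x rev-parse HEAD)

# Switching back does not move the submodule by itself; the update does.
git -C y checkout -q master
sub update --remote sub/x
on_master=$(git -C y/sub/x rev-parse HEAD)

echo "yA -> $on_yA"
echo "master -> $on_master"
```

Note that the submodule ends up on a detached HEAD at the tip of origin/xA or origin/master, not on a local branch, and that the explicit update step is still required after every switch.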
It is all very complicated, with a lot of moving parts, and these parts must all line up at the right time. The superproject is ultimately just using raw hash IDs, so it is less headache-inducing to just use the right raw hash IDs. Once they're in a commit, they cannot be changed, and that's normally the right thing, so you just have to make sure they're correct before you commit.