How to selectively not merge when pulling from github repo

Let's say I have cloned a github repo to my computer, have changed some of these files locally, but when I do a new git pull do not want to merge in any way with the github repo's version of those files. That is, I want my changes to those local files not in any way to be superseded or changed by the repo's versions -- although I do want to keep pulling to get any new files from the repo. Another way to say this might be, How can you pick which files to merge and which not to merge? I know if I change a file on my end I should do a commit... Beyond that I'm lost.

Solution

While matt's answer may sound simple enough (and actually the process can be made simpler still), it will likely lead to problems.

The first problem is... yes, those steps will work as written (once) but they'll leave you in a state where it isn't obvious what you should do next.

You mentioned committing your local changes, so let's suppose your starting point looks like this:

... O <--(origin/master)
     \
      A <--(master)

That is, you cloned the repo from origin, and master was at a commit O (which may have any arbitrary history; that won't matter here). You made some changes and committed them, creating commit A; your local master is now at A, which is checked out.

You do a git fetch, and it turns out there were new commits on origin/master. Now you have

... O -- R <--(origin/master)
     \
      A <--(master)
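
If you'd like to convince yourself of that picture before touching a real repo, here is a minimal sandbox sketch. Everything in it is made up for illustration: the temporary directories, the file names, the demo identity and the `g` helper are assumptions, not anything from your project.

```shell
set -eu
tmp=$(mktemp -d)
# Hypothetical helper: run git with a throwaway identity so commits work anywhere.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# "Remote" repository containing commit O on master.
g init -q "$tmp/remote"
cd "$tmp/remote"
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
g add .
g commit -q -m O

# Local clone with one local commit A.
g clone -q "$tmp/remote" "$tmp/local"
cd "$tmp/local"
echo mine >> file.txt
g commit -q -am A
before=$(git rev-parse master)

# Meanwhile the remote gains commit R.
( cd "$tmp/remote" && echo new > new.txt && g add . && g commit -q -m R )

# Fetch only updates origin/master; local master stays where it was.
g fetch -q origin
after=$(git rev-parse master)

# Commits only on master (left) and only on origin/master (right).
counts=$(git rev-list --left-right --count master...origin/master)
git log --oneline --graph master origin/master
```

In this sandbox `before` and `after` are the same commit, and `counts` reports one commit on each side, which matches the diagram.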

Your local branch and worktree are indeed unchanged, so far so good. Now you said you want to identify what files the remote has that you don't have on your local master. You could do something like

git diff --name-only --diff-filter=A master origin/master

This will give you the names of any files that would be added (--diff-filter=A) when moving from master to origin/master. In addition to files that were created by the new commits fetched from the remote, this would also show any files you deleted on your local master. If that's an issue, you could do something like

git diff --name-only --diff-filter=A $(git merge-base master origin/master) origin/master

This figures out where your local branch diverged from origin's, and compares origin/master to that instead of comparing it to your branch directly. On the other hand, if both you and the remote created a file, it will be included in the list if you use this version of the command, but not if you use the earlier one.

So if you are creating and/or removing files, then you have to think about which comparison makes sense (or look at both, or whatever); but if you're only editing existing files then you could use either.
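
To make the difference between the two comparisons concrete, here is a small sandbox sketch. The file names (`keep.txt`, `gone.txt`, `both.txt`, `new.txt`), the temporary directories and the `g` helper are all hypothetical; the only real pieces are the two `git diff` commands from above.

```shell
set -eu
tmp=$(mktemp -d)
# Hypothetical helper: run git with a throwaway identity.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# "Remote" repository with commit O.
g init -q "$tmp/remote"
cd "$tmp/remote"
git symbolic-ref HEAD refs/heads/master
echo one > keep.txt
echo two > gone.txt
g add .
g commit -q -m O

# Local commit A deletes gone.txt and creates both.txt.
g clone -q "$tmp/remote" "$tmp/local"
cd "$tmp/local"
git rm -q gone.txt
echo mine > both.txt
g add both.txt
g commit -q -m A

# Remote commit R creates new.txt and, independently, both.txt.
cd "$tmp/remote"
echo new > new.txt
echo theirs > both.txt
g add .
g commit -q -m R

cd "$tmp/local"
g fetch -q origin

# Comparison 1: local master directly against origin/master.
direct=$(git diff --name-only --diff-filter=A master origin/master)
# Comparison 2: the point of divergence against origin/master.
from_base=$(git diff --name-only --diff-filter=A \
  "$(git merge-base master origin/master)" origin/master)

printf 'direct:\n%s\n\nfrom merge base:\n%s\n' "$direct" "$from_base"
```

Here the direct comparison lists `gone.txt` and `new.txt` (the file you deleted shows up as "added"), while the merge-base comparison lists `both.txt` and `new.txt` (the file both sides created shows up instead).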

OK, all well and good; let's suppose you have your list. You then give that list to git checkout, and you can automate that step as well if you like:

git checkout origin/master -- $(git diff --name-only --diff-filter=A master origin/master)

Now all the new files appear in your working directory (and index) as uncommitted new files. That may not seem to make sense - they were committed on the remote, right? But you still have your local commit A checked out, and in that context the files are uncommitted.
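
One caveat with a one-liner built on command substitution: the shell splits the file list on whitespace, so a path containing a space gets broken apart, and if the list is empty, git checkout is left with a commit and no paths, which as far as I can tell it treats as a request to switch to origin/master. A more defensive sketch passes NUL-separated names through xargs and skips the checkout when there is nothing to copy. It assumes your xargs supports `-0` (GNU and BSD versions do); the sandbox setup, the file names and the `g` helper are hypothetical.

```shell
set -eu
tmp=$(mktemp -d)
# Hypothetical helper: run git with a throwaway identity.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# "Remote" repository with commit O.
g init -q "$tmp/remote"
cd "$tmp/remote"
git symbolic-ref HEAD refs/heads/master
echo base > shared.txt
g add .
g commit -q -m O

# Local commit A edits shared.txt.
g clone -q "$tmp/remote" "$tmp/local"
cd "$tmp/local"
echo mine > shared.txt
g commit -q -am A

# Remote commit R edits shared.txt too and adds a file with a space in its name.
cd "$tmp/remote"
echo theirs > shared.txt
echo new > "new file.txt"
g add .
g commit -q -m R

cd "$tmp/local"
g fetch -q origin

# -z emits NUL-terminated paths; xargs -0 hands them over intact.
# The inner "if" skips git checkout entirely when no paths arrive.
git diff -z --name-only --diff-filter=A master origin/master |
  xargs -0 sh -c 'if [ "$#" -gt 0 ]; then git checkout origin/master -- "$@"; fi' sh

git status --short
```

Afterwards the new file is staged, your version of `shared.txt` is untouched, and you are still on master.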

You could commit them. Then the picture would be something like

... O -- R <--(origin/master)
     \
      A -- B <--(master)

where B appears to create the files that were added from the remote. In other words, B contains some (but likely not all) of the changes from the remote. (And while the above picture shows just one new commit from the remote, in a real case the changes could have come from any number of new commits.)

That's ok, but it might cause some extra work if you ever want to recombine your branch with origin/master.

Now suppose you make some more changes, and fetch again from the remote:

... O -- R -- S -- T <--(origin/master)
     \
      A -- B -- C <--(master)

Well, you can again do

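git diff --name-only --diff-filter=A master origin/master

You can't really use the merge-base variation this time: B was an ordinary commit, not a merge, so the merge base of master and origin/master is still O, and that comparison would list the files you already copied from R all over again. If you've somewhere kept track of the fact that you previously incorporated changes up to R, then you could do

git diff --name-only --diff-filter=A origin/master~2 origin/master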

(based on seeing that there are 2 new commits after R). So the procedure is kinda-sorta repeatable, but it does get a little more complex as you go. I guess by always moving a tag to the last commit from which you'd copied files, you could make it a little easier.
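
The tag idea could look roughly like the sketch below. The tag name `copied-up-to` is something I made up, as are the sandbox directories, file names and helpers; the point is only that diffing from the tag avoids counting commits by hand.

```shell
set -eu
tmp=$(mktemp -d)
# Hypothetical helper: run git with a throwaway identity.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }
# Hypothetical helper: add one new file to the "remote" as its own commit.
remote_add() {
  ( cd "$tmp/remote" && echo "$1" > "$1" && g add "$1" && g commit -q -m "add $1" )
}

g init -q "$tmp/remote"
( cd "$tmp/remote" && git symbolic-ref HEAD refs/heads/master )
remote_add o.txt                        # commit O
g clone -q "$tmp/remote" "$tmp/local"
cd "$tmp/local"

remote_add r.txt                        # commit R
g fetch -q origin
g checkout -q origin/master -- r.txt    # copy the new file
g commit -q -m B
git tag -f copied-up-to origin/master   # remember how far we have copied

remote_add s.txt                        # commit S
remote_add t.txt                        # commit T
g fetch -q origin

# Files added on the remote since the last copy.
since_tag=$(git diff --name-only --diff-filter=A copied-up-to origin/master)
# For contrast: the merge-base comparison still starts at O.
since_base=$(git diff --name-only --diff-filter=A \
  "$(git merge-base master origin/master)" origin/master)

printf 'since tag:\n%s\n\nsince merge base:\n%s\n' "$since_tag" "$since_base"
```

In this sandbox the tag-based comparison lists only `s.txt` and `t.txt`, whereas the merge-base comparison lists `r.txt` again as well. After copying the new batch you would move the tag forward with the same `git tag -f` command.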

But also, you're diverging more and more from what's going on in the remote. That might not matter, or it might.

For example, if the repo contains the source code for a single software project, then the changes you copy could depend on the changes you ignore (a risk that increases with time / number of changes brought in). Of course, depending on the nature of your repo, it's possible that doesn't apply to you, though that would be atypical.

So bottom line - maybe all that's fine with you, and this version of the procedure might do just what you want. But it's really not a way I recommend interacting with the remote on anything resembling a typical project.