git log --oneline command not displaying recent commit

Interesting issue I ran into at work the other day. I'm a new data engineer learning the ropes. I committed two files, pushed them to my remote branch, and opened a PR, which is waiting to be reviewed for merging.

However, my manager told me to remove one of the files from the PR, as it had barely any changes. The novice in me made a big no-no and deleted the file directly in the GitHub web interface. (For example's sake, news/reddit.py is the sample file.)

Now, this obviously makes a new commit. I would like to revert these changes using git revert <commithash> --no-edit. The issue is that when I run git log --oneline in the terminal, there is no hash for the commit created by the file deletion.

Now, when I grab the hash from GitHub and run git revert 738bd6a --no-edit, I receive this error: fatal: bad revision '738bd6a'

Running the revert command with the full hash gives me this instead: fatal: bad object 738bd6a3bccab7d96e9e7f93871285eea72943cf

I know I can just close the PR and recommit the files, but I wanted to see if anyone has any idea why the hashes won't work and why I can't find the hash by running git log --oneline.

Thanks!

Solution

I think the answer here is that you have mixed up GitHub and Git. GitHub is a site that hosts Git repositories and lets you do various things with the hosted repositories, including a lot of add-on services not offered by base Git. Git itself is software that:

does version control (to some extent)
by providing a distributed+replicated database of commit objects that store files.

So, you install Git on your own computer (such as your laptop), and there, you create and manipulate one or more Git repositories. These repositories contain commits, and most of the things that you do will eventually result in adding new commits.

New commits you add to your repository, on your laptop, are only in your repository on your laptop. To get these new commits to some other Git—some other repository-managing software, typically on some other machine—you will send the commits to them, with git push. If that other Git resides on one of GitHub's computers, that's fine: now they have your new commits (plus any old ones: when one Git talks with another, they generally send all the commits they possibly can, although the details here are complicated).

When you use git clone on your laptop, you are having your Git (your software on your laptop) create a new, empty repository, then save away a URL for some other Git, such as one over on GitHub. You then have your Git connect to their Git and copy all of their commits into your repository. That copying step happens with git fetch, which is the opposite—or as close as Git gets to one—of git push. You would think git pull would be the opposite, but it's not.

So, what went wrong? Well, you made some commit(s) on your laptop, sent them to GitHub, and used the GitHub clicky web interface to make what GitHub calls a pull request, which is a GitHub concept. (Base Git doesn't have Pull Requests.) So far, we're all good. But then you used the GitHub clicky web interface to add a new commit to your repository over on GitHub. That commit is there, in that Git repository on GitHub, but it's not on your laptop.

So, then you went over to your laptop Git and told it to revert a commit, giving it the hash ID of the commit that only exists in the repository over on GitHub. Your Git said: What? What the heck are you talking about? I don't have a 738bd6a. That's because ... it doesn't have that commit.

To get your Git to have that commit, you must connect your Git (your software working in your laptop repository) to their Git (their software running on your GitHub repository), using git fetch or git pull. The git fetch step is the most critical one: your Git will call up their Git, they'll list out their branch names and hash IDs, and your Git will see 738bd6a3bccab7d96e9e7f93871285eea72943cf and think to itself: Oh, hey, I don't have that commit! I'd better ask for it! Your Git will ask their Git to send it over, and they will, and now you'll have 738bd6a3bccab7d96e9e7f93871285eea72943cf.
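If you want to see this for yourself without touching your work repository, here is a minimal sketch that reproduces the situation in a throwaway directory. Everything in it is made up for the demo: a local bare repository named hosted.git stands in for GitHub, a clone named laptop stands in for your machine, and a second clone named web stands in for the GitHub web editor.

```shell
set -eu
cd "$(mktemp -d)"   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# hosted.git plays the role of the repository on GitHub (an assumption).
git init -q --bare hosted.git
git --git-dir=hosted.git symbolic-ref HEAD refs/heads/main

# laptop: commit two files and push them.
git clone -q hosted.git laptop 2>/dev/null
cd laptop
git symbolic-ref HEAD refs/heads/main
mkdir news
echo 'print("reddit")' > news/reddit.py
echo 'print("other")' > news/other.py
git add news
git commit -q -m "Add two news scripts"
git push -q origin main
cd ..

# web: a second clone that deletes one file, like the web editor did.
git clone -q hosted.git web
cd web
git rm -q news/reddit.py
git commit -q -m "Delete news/reddit.py"
deleted=$(git rev-parse HEAD)
git push -q origin main
cd ../laptop

# Before fetching, the laptop repository has never heard of that commit.
if git cat-file -e "$deleted^{commit}" 2>/dev/null; then before=present; else before=missing; fi

git fetch -q origin

# After fetching, the commit object exists locally.
if git cat-file -e "$deleted^{commit}" 2>/dev/null; then after=present; else after=missing; fi

echo "before fetch: $before, after fetch: $after"
```

The hash only becomes usable in the laptop clone once git fetch has copied the commit over, which is the same reason your git revert failed.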

While git fetch alone will get this commit, that's not quite enough. Your repository on your laptop has commits, which your Git and their Git will share with each other. Their repository has commits, which your Git and their Git will share with each other. But both your and their repository have branch names too, and these names are not shared: not exactly.

Instead, when your Git is talking to theirs, getting commits from them, your Git sees their branch names, but changes those names. Instead of main or master, your Git adds your remote name, origin, in front, plus a slash, so that the name is now origin/main or origin/master. This is not a branch name, but rather a remote-tracking name. Your Git stuffs the new commits in your own repository, and creates or updates any remote-tracking names as needed so as to be able to find those commits.¹
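To make the difference between a branch name and a remote-tracking name concrete, here is a small sketch along the same lines. The repository names (hosted.git, laptop, web) are hypothetical stand-ins, with a local bare repository playing GitHub. It records where main and origin/main point before and after a fetch.

```shell
set -eu
cd "$(mktemp -d)"   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare hosted.git            # stand-in for GitHub
git --git-dir=hosted.git symbolic-ref HEAD refs/heads/main
git clone -q hosted.git laptop 2>/dev/null
cd laptop
git symbolic-ref HEAD refs/heads/main
mkdir news
echo 'print("reddit")' > news/reddit.py
echo 'print("other")' > news/other.py
git add news
git commit -q -m "Add two news scripts"
git push -q origin main
cd ..

git clone -q hosted.git web              # stand-in for the web editor
cd web
git rm -q news/reddit.py
git commit -q -m "Delete news/reddit.py"
git push -q origin main
cd ../laptop

local_before=$(git rev-parse main)
tracking_before=$(git rev-parse origin/main)

git fetch -q origin

local_after=$(git rev-parse main)
tracking_after=$(git rev-parse origin/main)
hosted=$(git --git-dir=../hosted.git rev-parse main)

# List your branch names and your remote-tracking names side by side.
git for-each-ref --format='%(refname) %(objectname:short)' refs/heads refs/remotes
```

The fetch moves origin/main to match the hosted main, while your own main stays exactly where it was.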

But: you do your work on your branch names. So once you have some new commit(s) that you got from someone else, you may want to update your own branches. To do that, you'll generally want to run git merge, or git rebase, or some other additional Git command. Exactly which command to use, when, and why, gets pretty involved, and includes cultural details of your particular workplace, so it's hard to say which one you should reach for first. For this particular case, though, merge is probably the right answer.

Git's git pull command runs git fetch for you, and then—providing that the fetch step worked²—runs a second Git command, usually git merge or git rebase.³ So if you know that the fetch step will work and do something particular, you can use git pull as a shortcut, instead of running the two separate commands. I advise newbies to use the two separate commands anyway, because it helps in multiple ways:

It keeps clear the separation between git fetch (always safe, almost always works) and the second command (sometimes not safe, often doesn't work).
It allows you to see what git fetch fetched before you choose which second command to run.
When the second command fails, you know which command you ran, and therefore what the next steps are when getting help (they differ for merge vs rebase).
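Here is a sketch of the two-step version, again in a sandbox with hypothetical names (hosted.git for GitHub, laptop for your clone, web for the web editor). Between the fetch and the merge, it looks at what came in by comparing main with origin/main.

```shell
set -eu
cd "$(mktemp -d)"   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare hosted.git            # stand-in for GitHub
git --git-dir=hosted.git symbolic-ref HEAD refs/heads/main
git clone -q hosted.git laptop 2>/dev/null
cd laptop
git symbolic-ref HEAD refs/heads/main
mkdir news
echo 'print("reddit")' > news/reddit.py
echo 'print("other")' > news/other.py
git add news
git commit -q -m "Add two news scripts"
git push -q origin main
cd ..

git clone -q hosted.git web              # stand-in for the web editor
cd web
git rm -q news/reddit.py
git commit -q -m "Delete news/reddit.py"
git push -q origin main
cd ../laptop

# Step 1: fetch. This only adds commits and updates origin/main.
git fetch -q origin

# Inspect what arrived before deciding what to do with it.
incoming=$(git rev-list --count main..origin/main)
git log --oneline main..origin/main

# Step 2: your choice of second command; here, a fast-forward-only merge.
git merge -q --ff-only origin/main
remaining=$(git rev-list --count main..origin/main)
```

With --ff-only, the merge step either moves your branch name forward or stops without changing anything, which is a gentle default while you are still learning.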

The fact that Git is distributed, so that every Git clone can have a full copy of the entire repository (up to the point at which it was cloned or last re-updated with fetch), is tricky for many Git newbies. Add on the complexities of various other Git commands and Git becomes a sinkhole for many programmers. I'd encourage you to:

run git log --all --graph or git log --all --graph --oneline; see also Pretty Git branch graphs
then, run git fetch and repeat the git log --graph.

See how the one incoming commit updated origin/<somebranch>. Note how new commits add on to the ends of branches (as remembered by your remote-tracking names), "moving the branch forward", as it were. Your Git now mirrors what you did using GitHub's Git. You can now update your own branch, which in this case will just move your branch name forward, using git merge or git merge --ff-only. Then you can revert the commit, adding yet another commit onto the end of your own (local) branch. Run git log --graph again along the way to observe these updates.
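Put together, the whole recovery might look like the sketch below. It runs in a sandbox where a local bare repository named hosted.git plays GitHub and a second clone named web plays the web editor. Those names, and the branch name main, are assumptions for the demo, so substitute your own remote and branch.

```shell
set -eu
cd "$(mktemp -d)"   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare hosted.git            # stand-in for GitHub
git --git-dir=hosted.git symbolic-ref HEAD refs/heads/main
git clone -q hosted.git laptop 2>/dev/null
cd laptop
git symbolic-ref HEAD refs/heads/main
mkdir news
echo 'print("reddit")' > news/reddit.py
echo 'print("other")' > news/other.py
git add news
git commit -q -m "Add two news scripts"
git push -q origin main
cd ..

git clone -q hosted.git web              # stand-in for the web editor
cd web
git rm -q news/reddit.py
git commit -q -m "Delete news/reddit.py"
git push -q origin main
cd ../laptop

git fetch -q origin                      # get the deletion commit
git merge -q --ff-only origin/main       # move your main forward to it
deletion=$(git rev-parse HEAD)           # the commit you want to undo
git revert --no-edit "$deletion"         # new commit that restores the file
git push -q origin main                  # send the revert to the hosted repo

if [ -f news/reddit.py ]; then restored=yes; else restored=no; fi
commits=$(git rev-list --count HEAD)
laptop_tip=$(git rev-parse HEAD)
hosted_tip=$(git --git-dir=../hosted.git rev-parse main)
git log --oneline --graph --all
```

The history ends up with three commits: the original one, the deletion, and the revert that brings news/reddit.py back.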

Finally, you can git push the new commit to GitHub's Git. This will add the commit to the chain, and show you one other peculiarity: when you use git fetch, you get commits from them, and then your Git updates your remote-tracking names. There's one remote-tracking name in your repository for every one of their branch names, so this is a safe thing to do. But: when you run git push, you send them a new commit, and then ask them to update one of their branch names. They don't have an equivalent of remote-tracking names here. You just have them update their branch names.

Since we (and Git) like to find commits using branch names, this gives rise to a problem you will eventually run into, where their Git says "I won't accept this name-update request because it's a non-fast-forward". See, e.g., Git push rejected "non-fast-forward". Sometimes you want to force them to take this update anyway: Yes, throw away some commits, these are their new-and-improved replacements. Sometimes you don't.
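For the case where you don't want to force anything, here is a sketch of how the rejection can come about and one way out of it. As before, hosted.git, laptop, and web are made-up stand-ins in a throwaway sandbox, and merge is only one of the possible second commands.

```shell
set -eu
cd "$(mktemp -d)"   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare hosted.git            # stand-in for GitHub
git --git-dir=hosted.git symbolic-ref HEAD refs/heads/main
git clone -q hosted.git laptop 2>/dev/null
cd laptop
git symbolic-ref HEAD refs/heads/main
mkdir news
echo 'print("reddit")' > news/reddit.py
echo 'print("other")' > news/other.py
git add news
git commit -q -m "Add two news scripts"
git push -q origin main
cd ..

git clone -q hosted.git web              # stand-in for the web editor
cd web
git rm -q news/reddit.py
git commit -q -m "Delete news/reddit.py"
git push -q origin main
cd ../laptop

# Commit locally without fetching first: your main and theirs now diverge.
echo 'print("more")' >> news/other.py
git commit -q -am "Tweak news/other.py"

# Their Git refuses: taking your main would drop the deletion commit.
if git push -q origin main 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

# Pick up their commit, combine it with yours, and try again.
git fetch -q origin
git merge -q --no-edit origin/main
if git push -q origin main 2>/dev/null; then second_push=accepted; else second_push=rejected; fi

echo "first push: $first_push, second push: $second_push"
```

After the merge, your main contains their commit as well as yours, so the update is a fast-forward from their point of view and they accept it.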

¹You might wonder why we need names to find commits. Technically, we don't: you can use 738bd6a3bccab7d96e9e7f93871285eea72943cf all the time. But are you going to remember 738bd6a3bccab7d96e9e7f93871285eea72943cf tomorrow? If it has a nice simple and memorable name as an alias, like foobranch or origin/foobranch, now that you can remember.

Moreover, branch names automatically update as you add new commits. Remote-tracking names automatically update when you run git fetch, which adds any new commits you got from them, which is kind of the same, only different. The big difference between a branch name and a remote-tracking name, aside from the fact that your Git creates and updates your remote-tracking names from their branch names, is that git switch origin/branch refuses to run, and git checkout origin/branch puts you in "detached HEAD" mode, which is not a good way to do normal every-day work.

In short: branch names = where you do your stuff. Remote-tracking names = your Git's memory of their Git's branches. Use git fetch to update your remote-tracking names.