Tags: git, bitbucket, git-remote, git-rewrite-history

git rewrite history on remote


Disclaimer: I'm sure the title of this question has already raised all your red flags and made you want to either flame me or explain why rewriting history is a grave sin. That's fine, you can do that, but please also address the question I actually have.

I made some very, very bad mistakes in my git repo which blew it up to several gigabytes, yikes! Luckily, this is my own private repo, so I went through and rewrote history to remove all the mistakes. I followed these instructions, basically:

  1. Find all the files I want to remove.
  2. Use git-filter-branch to remove them.
  3. Remove the old records from the reflog and refs.
  4. Re-garbage collect.
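For readers following along, the four steps above can be sketched roughly as below. This is only an illustration in a throwaway repository, not the exact instructions that were followed: the file name big.bin, the temporary directory, and the commit messages are all made up for the example.

```shell
#!/bin/sh
# Sketch only: runs in a throwaway repository; big.bin is a hypothetical file.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the filter-branch warning pause
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# Simulate the mistake: an ordinary commit, then a large file added by accident.
echo "hello" > README
git add README
git commit -q -m "add README"
head -c 1048576 /dev/urandom > big.bin
git add big.bin
git commit -q -m "add big.bin by mistake"

# 1. Find the largest blobs anywhere in history (size in bytes, then path).
git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  awk '$1 == "blob"' | sort -k3 -n -r | head -n 5

# 2. Remove the file from every commit on every ref.
git filter-branch -f --index-filter \
  'git rm --cached --ignore-unmatch -q big.bin' --prune-empty -- --all

# 3. Drop the backup refs and reflog entries that still reference old commits.
git for-each-ref --format='%(refname)' refs/original/ |
  while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all

# 4. Garbage-collect the objects that are now unreachable.
git gc -q --prune=now --aggressive

# Count how many reachable objects still carry the big.bin path.
remaining=$(git rev-list --objects --all | grep -c 'big.bin' || true)
echo "objects still referencing big.bin: $remaining"
```

Note that all of this only shrinks the local repository. The remote keeps its own copy of the old objects until its refs are updated and it runs its own garbage collection.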

Doing this, I got the repo down from several gigabytes to a single megabyte, yay! I followed up with a git push origin --force --all, and everything updated successfully.

Now, I want to clone my repo into a new system to test that my rewrite worked. When I clone, I get this:

remote: Enumerating objects: 6359, done.
remote: Counting objects: 100% (6359/6359), done.
remote: Compressing objects: 100% (4857/4857), done.
fatal: The remote end hung up unexpectedly 1.70 GiB | 3.05 MiB/s

This seems strange to me. First of all, my repo is no longer several gigabytes; it is only 1 megabyte. Additionally, I do not have 6359 objects. After the rewrite, I only have 2128 objects. It looks like the clone is downloading the repo as it was before I pushed the rewrite.

So my question is: How can I get the remote repo to see that I've rewritten history to make the repo much, much smaller?

Thanks!

P.S. The repo is on Bitbucket.


Solution

  • Double-check that no old branches on Bitbucket still point to the unfiltered version of the project. It's possible, for example, that Bitbucket protects the master branch from force pushes, in which case it would still point to the old history. Running git fetch and then gitk --all --remotes in your old repo is a good way to see the state of the upstream repo.

    The remote end should not have hung up unexpectedly either way, even if there was a lot of data to download. That sounds like a problem with your internet connection or with Bitbucket.

    Note: Rewriting history is a totally reasonable thing to do in cases like this, or if, for example, you have committed secrets or GDPR-protected information to your repository.
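If you prefer a non-GUI version of the branch check described above, something along these lines might work. It is a sketch under assumptions: a local bare repository stands in for Bitbucket, and the branch name old-feature is invented to play the role of a branch that was never force-pushed. It lists every remote-tracking branch whose tip is not part of the rewritten history.

```shell
#!/bin/sh
# Sketch only: a local bare repository stands in for the Bitbucket remote,
# and old-feature is a hypothetical branch that was left behind.
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.email "you@example.com"
git config user.name "Example"
git remote add origin "$work/remote.git"

echo one > file.txt
git add file.txt
git commit -q -m "original history"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"
git push -q origin "$branch:refs/heads/old-feature"

# Rewrite local history, then force-push only the current branch.
git commit -q --amend -m "rewritten history"
git push -q --force origin "$branch"

# Refresh remote-tracking refs, pruning any that were deleted upstream.
git fetch -q --prune origin

# Report every remote branch whose tip is not contained in the rewritten branch.
stale=$(git for-each-ref --format='%(refname)' refs/remotes/origin |
  while read -r ref; do
    git merge-base --is-ancestor "$ref" "$branch" || echo "$ref"
  done)
echo "remote branches still on old history: $stale"
```

Any branch reported here keeps the old objects reachable on the remote, so a fresh clone would still download them. Force-pushing or deleting those branches (and checking tags the same way) would be the next thing to try.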