Tags: git, visual-studio, azure-devops

Must delete massive inadvertent checkin from remote git repo


A massive 2 GB file has been inadvertently pushed to our remote git repo on Azure DevOps. I noticed this when reviewing the pull request.

I cannot have a file that big in my repo and need to delete it, from the history as well. I don't want someone cloning the repo to pull down a pointless 2 GB file.

The checkin is still on a branch. It has not been merged to master, but it is on the remote repo.

As long as nobody else is using this branch, is it safe for my user to do a git reset on this branch? I'm told that a git revert will not delete the file from the history and that I need a git reset.

Is this my path to a fix?


Solution

  • I'm told that a git revert will not delete the file from the history and that I need a git reset. Is this my path to a fix?

    Mostly, yes. A revert only adds a new commit that undoes the change, so the commit that added the file stays in the branch's history. The user should instead rewrite their branch so that the large file no longer exists in any commit on that branch. There are a few different ways to do this, and reset is one of them. Presumably there are other changes on the branch that the user still needs, so they can make a commit that deletes the file, then do an interactive rebase and squash the delete commit into the commit that added the file. Regardless of which method is used to rewrite the branch, the branch then needs to be force pushed to update the PR.
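
    As a rough sketch of the "delete commit plus squash" approach, the script below replays it in a throwaway repo so it can be tried safely. The branch name (feature), the file names (big.bin, code.txt) and the 1 MiB stand-in file are all hypothetical. It uses git commit --fixup and git rebase --autosquash as one way to do the squash; doing the same by hand in the interactive rebase editor works too.

```shell
#!/bin/sh
# Sketch only: runs in a scratch directory, not in your real repo.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Base commit standing in for master.
echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

# Feature branch: one commit adds real work plus the oversized file by mistake.
git checkout -q -b feature
echo "real work" > code.txt
head -c 1048576 /dev/zero > big.bin   # 1 MiB stand-in for the 2 GB file
git add code.txt big.bin
git commit -q -m "Add feature (and, by mistake, big.bin)"
bad=$(git rev-parse HEAD)
echo "more work" >> code.txt
git commit -q -am "More feature work"

# Step 1: commit the deletion, marked as a fixup of the commit that added the file.
git rm -q big.bin
git commit -q --fixup="$bad"

# Step 2: squash the fixup into that commit. GIT_SEQUENCE_EDITOR=true accepts
# the autosquashed todo list without opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"

# Step 3: check that no commit on the branch touches the file any more.
hits=$(git log --oneline "$base"..feature -- big.bin | wc -l | tr -d ' ')
echo "commits touching big.bin: $hits"

# Step 4, against a real remote (assumed here to be called origin):
#   git push --force-with-lease origin feature
```

    On the real branch only steps 1, 2 and 4 apply, with the real file path and the ID of the commit that added it. --force-with-lease is suggested over a plain --force because it refuses to overwrite the remote branch if someone else has pushed to it in the meantime.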

    Note there are some caveats here:

    1. Once the repo doesn't have any refs (e.g. branches, tags, etc.) pointing to the big bad commit, new clones won't contain the large file, but it won't instantly disappear from existing copies of the repo that already fetched that commit. It will eventually get garbage collected there. Still, after the branch is force pushed, I would recommend figuring out the exact command(s) needed to either force a gc or prune that commit from a local repo, testing them, and sharing the instructions with the rest of your users.
    2. Azure DevOps might never delete that commit, even if it has been removed from all local copies of the repo. In other words, the commit will be orphaned on the server, probably forever. If you know the commit ID, or any historical PR (even an abandoned one) that referenced it, you will still be able to navigate to and access the original commit. From there you could even "revive" the commit by creating a remote branch at that commit, which would make the file reappear in fetches and clones. The fact that this capability exists normally shouldn't be a concern, but if it is a problem for you that the file is never truly deleted from the remote server, you may have to involve MS support to see if they can somehow prune that commit from the history. (Note that whether this is possible may also depend on whether you're using AzDO Server or Services.)
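
    For caveat 1, one commonly used recipe in an existing clone is to expire the reflog and then garbage collect with immediate pruning. The sketch below tries that recipe in a throwaway repo so the effect can be observed; the branch and file names are hypothetical, and it should be tested against your own setup before being shared with other users.

```shell
#!/bin/sh
# Sketch only: runs in a scratch directory, not in your real repo.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# A branch carrying the unwanted file, then dropped again, so that the
# commit and its blob are no longer reachable from any branch.
git checkout -q -b feature
head -c 1048576 /dev/urandom > big.bin   # 1 MiB stand-in for the 2 GB file
git add big.bin
git commit -q -m "Add big.bin by mistake"
blob=$(git rev-parse HEAD:big.bin)
git checkout -q "$main"
git branch -q -D feature

# The blob is still in the object store, kept alive by the reflog.
git cat-file -e "$blob" && echo "before: blob still present"

# Drop all reflog entries, then prune unreachable objects immediately.
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
    status="present"
else
    status="pruned"
fi
echo "after: blob $status"
```

    In a real clone, the remote-tracking ref and any local branch that still point at the old commit keep it alive as well, so the local branch has to be reset or deleted and the rewritten branch fetched (for example with git fetch --prune origin, assuming the remote is called origin) before the two cleanup commands have any effect. Note that expiring the reflog also discards the usual safety net for recovering other lost local commits.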

    Tip: In Azure DevOps, under Project Settings->Repositories, there are options for the entire system and also for each individual repo, where you can set the maximum file size that can be pushed. I recommend you enable that setting with a sensible limit. I would also recommend setting it a little too low rather than too high, since you can temporarily disable the setting in the rare instance that someone needs to purposefully push something above the limit. (Of course, this assumes it will be rare enough that an admin temporarily disabling and re-enabling it won't be a bottleneck or annoyance to developers.)