I am working on a group project and I want to remove a file from the repository's entire history. The content, the file name, everything! I don't want any trace of it left in the Git repo. I have been trying to do this using bfg, but I can still find the file on the GitHub page using its "browse the repository at this point in the history" feature.
The directory which is the git repo is .../electricity_profiles, and within the directory electricity_profiles/data there was the file I want to remove (I've tried bfg --delete-files .~lock.smart_meter_data_overlap.csv#). I have since removed it from the current commit, but it is still present a few commits back, in commit 5c50c67d1be4e869bc75fb7d3916b9fc814b8106.
How can I remove all evidence this file ever existed, even on GitHub, so that when other people pull the repository they won't see it?
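To illustrate the situation: deleting a file in a later commit does not remove it from the earlier commits that contain it. Below is a minimal sketch on a throwaway repository. The temporary directory, the file contents and the commit messages are made up for illustration; only the file name is taken from my repo.

```shell
#!/bin/sh
# Illustrative sketch only: the repo, file contents and commit messages are made up.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
# Pass an identity through the environment so no global git config is needed.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

f='data/.~lock.smart_meter_data_overlap.csv#'
mkdir data
echo "lock" > "$f"
git add "$f"
git commit -q -m "add lock file by accident"
first=$(git rev-parse HEAD)

git rm -q "$f"
git commit -q -m "remove lock file"

# Gone from the latest commit and from the working tree...
[ ! -e "$f" ] && echo "not in working tree"

# ...but the commits that added and deleted it are still in history,
hits=$(git log --all --format=%h -- "$f" | wc -l | tr -d ' ')
echo "commits touching the file: $hits"

# and its content can still be read back from the older commit.
content=$(git show "$first:$f")
echo "content at $first: $content"
```

This is why a plain git rm is not enough here and a history-rewriting tool is needed.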
I have looked at related questions, but haven't figured it out yet.
Work done so far (seems to work):
git clone --mirror https://github.com/oliversheridanmethven/electricity_profiles.git
bfg --delete-files .~lock.smart_meter_data_overlap.csv# electricity_profiles.git
Console output:
Using repo : /home/user/Documents/InFoMM/case_studies/trial/electricity_profiles.git
Found 20 objects to protect
Found 2 commit-pointing refs : HEAD, refs/heads/master
Protected commits
-----------------
These are your protected commits, and so their contents will NOT be altered:
* commit 1b1eef47 (protected by 'HEAD')
Cleaning
--------
Found 22 commits
Cleaning commits: 100% (22/22)
Cleaning commits completed in 141 ms.
Updating 1 Ref
--------------
Ref                 Before     After
---------------------------------------
refs/heads/master | 1b1eef47 | 9701a5b7
Updating references: 100% (1/1)
...Ref update completed in 26 ms.
Commit Tree-Dirt History
------------------------
Earliest        Latest
|                    |
......D..D..m.m.mmmmmm
D = dirty commits (file tree fixed)
m = modified commits (commit message or parents changed)
. = clean commits (no changes to file tree)
                        Before     After
-------------------------------------------
First modified commit | 5c50c67d | ff47bcdf
Last dirty commit     | 9671f6ad | f6d36763
Deleted files
-------------
Filename                               Git id
------------------------------------------------------
.~lock.smart_meter_data_overlap.csv# | 7cf2b24f (92 B)
In total, 14 object ids were changed. Full details are logged here:
/home/user/Documents/InFoMM/case_studies/trial/electricity_profiles.git.bfg-report/2017-01-18/11-48-37
BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive
Finishing off the process:
cd electricity_profiles.git
git push --mirror https://github.com/oliversheridanmethven/electricity_profiles.git
Looking at the GitHub repo, it seems to have worked.
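For anyone checking their own clone: as far as I understand it, rewriting the refs does not by itself delete the old objects, which is presumably why the BFG output suggests running git reflog expire and git gc afterwards. Here is a small local sketch of that behaviour on a throwaway repo. Deleting a scratch branch is used as a crude stand-in for the history rewrite (it is not what the BFG does), and the file names and branch name are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch: a commit dropped from all refs stays in the object
# database until reflogs are expired and garbage collection prunes it.
set -e
demo=$(mktemp -d)   # hypothetical throwaway location
cd "$demo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "first commit"
main=$(git symbolic-ref --short HEAD)

# Commit the unwanted file on a scratch branch and remember its hash.
git checkout -q -b scratch
echo "secret" > unwanted.txt
git add unwanted.txt
git commit -q -m "add unwanted file"
old=$(git rev-parse HEAD)

# Crude stand-in for a rewrite: make the commit unreachable from every ref.
git checkout -q "$main"
git branch -q -D scratch

# No branch points at the old commit any more, yet the object still exists.
before=$(git cat-file -t "$old")
echo "before gc: $before"

# The clean-up step suggested at the end of the BFG output.
git reflog expire --expire=now --all
git gc -q --prune=now

# After pruning, looking up the old hash fails.
if git cat-file -e "$old" 2>/dev/null; then after=present; else after=gone; fi
echo "after gc: $after"
```

The same hash lookup (git cat-file -e with the old commit id) can be tried in a real clone to see whether an old commit is still stored locally; it says nothing about what a remote host still keeps.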
I'm the author of the BFG - I re-titled your question to "Why can I still see files in GitHub history after cleaning them with the BFG?" because it likely better represents your issue.
Your question description does not make this entirely clear, but I am guessing that the report from the BFG run said it had deleted files. If the BFG had found no targets for deletion, it would have reported that as an error, and you don't mention seeing one, so my guess is that the BFG did find your files and deleted them from history.
First off, you need to make sure you were following all the steps at https://rtyley.github.io/bfg-repo-cleaner/#usage, particularly the ones involving the mirror repo.
If you followed all those steps correctly, why could you still see files in GitHub history after cleaning them with the BFG? A possible explanation is that GitHub has not done garbage collection on that repo yet. GitHub only does GC periodically, so old commits are still visible for some time afterwards: