
Is it possible to manually modify move/new/edit mappings on a git commit?


I have already read other questions (1, 2, 3) about how to make Git recognize a specific move, but they don't answer my real question: is it possible to manually control how Git interprets moves while committing, before the commit, or afterwards (by altering the commit's internal data)? I am open to standard answers, and even to hacking the Git repository files.

I am taking this seriously because letting Git know that a file has been moved, edited, replaced, etc. is very important when later reviewing the file's edits with any software, since the software can then show the edit history correctly regardless of which moves or renames the developer made. I think it is valuable information that the committer should take care to set properly, since that way the commit records more than a filesystem operation: it also records the developer's logical intent, the true meaning of the project edits.

Use cases:

Case 1:

  • move file main_configuration.txt to configurations/production/configuration.txt
  • create main_configuration.txt with content similar to the previous file's, but with a few lines changed

Git concludes that you edited a few lines of main_configuration.txt and added a new file, configurations/production/configuration.txt. But I don't want to lose track of the production configuration file's edits: it isn't a new file created in this commit :(

Case 2:

  • delete file a/a.txt
  • create file b/a.txt with similar content

Git reports a file move, but I need the Git history to show that a/a.txt was deleted in this commit and that b/a.txt was created in this commit. That is very important information, and what Git ends up reporting is a serious mistake that can affect later analysis.

There are many other examples, some with more context, but I tried to keep these as simple as possible.


Solution

  • I'd like to close this as a duplicate of How does git handle moving files in the file system? but you've already referenced that in your question. I think from the comments you've gotten the answer, but let's put one in place formally:

    • Git stores snapshots. Deltas—diffs—do not enter the picture at the level at which Git actually works with the files. (They do occur "below" that level, inside pack-files, as Lasse Vågsæther Karlsen notes in a comment. It's worth mentioning that these deltas, which use a modification of xdelta, are not line-by-line; they're byte-range-by-byte-range. So these are not what Git shows you!)

    • Git does not store the programmer's intent. Git just stores a snapshot of each file; it must, at git diff or git show time, attempt to reconstruct "the logical intentions of the developer", as you put it.

    • Hence, as you concluded, "the move information which Git is able to display in the log command ... is not stored on the commit but computed at display time."
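
To make "computed at display time" concrete, here is a toy sketch in Python. It is not Git's actual algorithm: the snapshot dictionaries, the `describe_changes` helper and the line-based similarity score are all assumptions made up for illustration. It only shows that the same two snapshots can be reported as a rename or as a delete plus an add, depending purely on how the viewer is configured.

```python
from difflib import SequenceMatcher


def similarity(old: str, new: str) -> float:
    """Return a 0.0-1.0 similarity score between two file contents (line based)."""
    return SequenceMatcher(None, old.splitlines(), new.splitlines()).ratio()


def describe_changes(before: dict, after: dict, threshold=0.5):
    """Compare two snapshots (path -> content) and guess what happened.

    threshold=None turns rename detection off entirely.
    """
    deleted = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    changes = []

    # Paths present in both snapshots are either unchanged or modified.
    for path in sorted(set(before) & set(after)):
        if before[path] != after[path]:
            changes.append(("modify", path))

    # Pair each deleted path with its most similar added path, if similar enough.
    if threshold is not None:
        for old_path in list(deleted):
            best = max(
                added,
                key=lambda p: similarity(before[old_path], after[p]),
                default=None,
            )
            if best is not None and similarity(before[old_path], after[best]) >= threshold:
                changes.append(("rename", old_path, best))
                deleted.remove(old_path)
                added.remove(best)

    changes += [("delete", p) for p in deleted]
    changes += [("add", p) for p in added]
    return changes


# Case 2 from the question: a/a.txt disappears, b/a.txt appears with similar content.
before = {"a/a.txt": "alpha\nbeta\ngamma\ndelta\n"}
after = {"b/a.txt": "alpha\nbeta\ngamma\nepsilon\n"}

print(describe_changes(before, after))                  # rename detection on
print(describe_changes(before, after, threshold=None))  # rename detection off
```

In real Git the corresponding knobs live on the viewing commands rather than in the commit: `git diff -M<n>` (or `--find-renames=<n>`) adjusts the similarity threshold and `git diff --no-renames` switches rename detection off, so Case 2 can be displayed as a delete plus a create without rewriting anything.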

    You should think of git diff (and hence git log -p) output as instructions to a computer, or maybe a human, about how to change the file on the left to make it match the file on the right. It doesn't matter how the change actually happened; Git just tries to come up with a minimal(ish) set of instructions to make it happen again, if you want it to happen again. This is true even if you diff the very first commit in the repository against the very last one: Git skips over all the intermediate commits, extracting the first and last snapshot, and computes a change-set that will take you from the first to the last.
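
The "first against last" point can be sketched with Python's standard `difflib`. The three snapshots and the `config.txt` file name below are hypothetical, and `difflib` is only a stand-in for Git's own diff engine. The diff is built from the two endpoint snapshots alone, so whatever happened in between leaves no trace in the instructions.

```python
from difflib import unified_diff

# Three hypothetical snapshots of the same file, oldest first.
snapshots = [
    "host = localhost\nport = 8080\n",
    "host = localhost\nport = 9090\ndebug = true\n",
    "host = example.org\nport = 8080\n",
]


def diff_snapshots(old: str, new: str, name: str = "config.txt") -> str:
    """Return instructions (a unified diff) that turn `old` into `new`."""
    lines = unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(lines)


# Only the first and last snapshots are read; the middle one is never consulted.
print(diff_snapshots(snapshots[0], snapshots[-1]))
```

The printed diff changes only the `host` line. The temporary port change and the `debug` line from the middle snapshot never appear, because nothing about the path taken between the two snapshots is consulted.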


    As a final conclusion, to document the developer's intent properly and allow software to reconstruct a file's edit history later, those strategic and easily misinterpreted changes can be split across several commits, so that each copy, edit, move or delete operation is explicit and cannot be hidden by two overlapping actions. In the end it is up to the developer to organize the changes into well-documented, understandable, self-explanatory, high-quality commits.
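
As a rough illustration of why splitting helps, the sketch below models Case 1 as a list of snapshots and classifies each step using exact content matches only. The `exact_changes` helper, the file contents and the plain SHA-1 of the text are assumptions for illustration; Git hashes objects in its own format and also has similarity-based detection, which this sketch deliberately leaves out.

```python
import hashlib


def content_id(content: str) -> str:
    """Hash file content; identical content always yields the same ID."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()


def exact_changes(before: dict, after: dict):
    """Classify changes between two snapshots using exact content matches only."""
    deleted = {p: content_id(c) for p, c in before.items() if p not in after}
    added = {p: content_id(c) for p, c in after.items() if p not in before}
    changes = []

    for old_path, old_id in sorted(deleted.items()):
        # A deleted path whose content reappears unchanged elsewhere is a move.
        match = next((p for p, i in sorted(added.items()) if i == old_id), None)
        if match is None:
            changes.append(("delete", old_path))
        else:
            changes.append(("rename", old_path, match))
            del added[match]

    changes += [("add", p) for p in sorted(added)]
    changes += [("modify", p) for p in sorted(before)
                if p in after and before[p] != after[p]]
    return changes


old_cfg = "env = production\nworkers = 8\n"
new_cfg = "env = development\nworkers = 8\n"
new_path = "configurations/production/configuration.txt"

# Case 1 split into two commits: first a pure move, then the new file.
history = [
    {"main_configuration.txt": old_cfg},
    {new_path: old_cfg},
    {new_path: old_cfg, "main_configuration.txt": new_cfg},
]

for parent, child in zip(history, history[1:]):
    print(exact_changes(parent, child))

# The same work squashed into a single commit loses the move.
print(exact_changes(history[0], history[-1]))
```

With the split history, the first step is reported as a rename and the second as an added file. Comparing the first and last snapshots directly instead yields an added production file plus a modified main_configuration.txt, which is the misleading reading described in Case 1.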