
Does git have a concept of log compaction?


The git version control system is a kind of distributed log (with some conceptual similarities to the Raft consensus protocol).

Raft and some other systems have a concept of log compaction, so new clients don't need to traverse the whole change set to apply changes.

My question is: Does git have a concept of log compaction?


Solution

  • new clients don't need to traverse the whole change set to apply changes.

    No, git is a collection of snapshots (each one a full copy of a working tree).
    When you access a commit in git, you don't have to traverse the whole log or history to build its content.

    See "How does git store files?": the internal storage does use delta compression in pack files (that is a form of "compaction", just not "log compaction"), but each commit still represents the full working tree.

    https://i.sstatic.net/AQ5TG.png

    Every time you commit, or save the state of your project in Git, it basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot.
    To be efficient, if files have not changed, Git doesn’t store the file again—just a link to the previous identical file it has already stored.
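
To make the snapshot idea concrete, here is a minimal sketch of a content-addressed store in Python. It is a toy model, not git's real object format: the `ToyRepo` class and its method names are hypothetical, and it hashes raw file content, whereas git hashes a header plus the content and also stores tree and commit objects. It only illustrates that each commit points at a complete snapshot, and that unchanged files are stored once and linked again.

```python
import hashlib


class ToyRepo:
    """Toy snapshot store (hypothetical; not git's actual on-disk format)."""

    def __init__(self):
        self.blobs = {}    # blob id -> file content, stored once per distinct content
        self.commits = []  # each commit is a full snapshot: {path: blob id}

    def commit(self, files):
        """Record a full snapshot of `files` ({path: content}); return its index."""
        snapshot = {}
        for path, content in files.items():
            blob_id = hashlib.sha1(content.encode("utf-8")).hexdigest()
            # Unchanged content hashes to the same id, so it is not stored again.
            self.blobs.setdefault(blob_id, content)
            snapshot[path] = blob_id
        self.commits.append(snapshot)
        return len(self.commits) - 1

    def checkout(self, index):
        """Rebuild the tree from one commit alone, without replaying history."""
        snapshot = self.commits[index]
        return {path: self.blobs[blob_id] for path, blob_id in snapshot.items()}


repo = ToyRepo()
first = repo.commit({"a.txt": "hello", "b.txt": "world"})
second = repo.commit({"a.txt": "hello", "b.txt": "world!"})

# The second commit is read directly; the first commit is never consulted.
tree = repo.checkout(second)
```

In this sketch, `checkout` needs only the commit it is given, which is why there is no log to compact: no chain of changes has to be replayed to get a working tree. The two commits also share the blob for `a.txt`, mirroring the "link to the previous identical file" behaviour described above.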