Tags: bash, macos, diff, git-diff

diff between folders whilst ignoring filename changes


How can I use diff in terminal but ignore changes in file names?

Currently this is what I'm doing:

diff -wrN folder1 folder2 | grep '^>' | wc -l

How can I do a git diff between two commit ids whilst:

  • ignoring file renames
  • only looking at Java files
  • ignoring specific folders, e.g. folders 'a' and 'b'
  • counting the result with grep '^>' | wc -l

Solution

  • You seem unaware of how hard this problem is, so I'd like to point out why it is so difficult.

    Take two directories which are identical in the beginning and each contain, say, 1000 files. Now you rename, say, 500 files in one of the directories. Renamings can vary greatly. A file originally called foobar.txt can be named DSC-3457.orig.jpg afterwards. The diff command cannot find it again without some idea of what has been renamed into what.

    Additionally, a file called x could be renamed to y, while a file called y could be renamed to x. In this case it is even questionable whether this should be regarded as a mere renaming or whether the two files' contents have simply been exchanged.

    All this means that, in general, this is hard to accomplish. Standard tools will not do it out of the box.

    That said, there are two aspects I want to point out which might help you.

    1. File Sizes

      You can sort all files by their file sizes and then diff each same-size pair from the two directories. This can work perfectly well if all changes you have are only renamings and if all files are of different sizes. If you have several files of the same size (maybe by pure chance or because they are all of the same format which has a fixed size), you are in trouble again and will have to compare every possible pair within each same-size group.
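
The size-based pairing above can be sketched roughly as follows. This is only an illustration, not a finished tool: the function names list_sizes and match_by_size are made up here, and the sketch assumes that file names contain no tabs or newlines. It pairs files of equal size and uses cmp to confirm that their contents are really identical, which also handles groups of several same-size files.

```shell
# list_sizes DIR (hypothetical helper): print "size<TAB>path" for each regular file.
list_sizes() {
  find "$1" -type f | while IFS= read -r f; do
    # wc -c pads its output with spaces on macOS, so strip whitespace.
    printf '%s\t%s\n' "$(wc -c < "$f" | tr -d '[:space:]')" "$f"
  done
}

# match_by_size DIR1 DIR2 (hypothetical helper): print "path1<TAB>path2" for
# every pair of files that have the same size and byte-identical content.
match_by_size() {
  tab=$(printf '\t')
  sizes2=$(mktemp) || return 1
  list_sizes "$2" > "$sizes2"
  list_sizes "$1" | while IFS="$tab" read -r size f1; do
    # Candidates are all files in DIR2 with the same size as f1.
    awk -F '\t' -v s="$size" '$1 == s { print $2 }' "$sizes2" |
      while IFS= read -r f2; do
        if cmp -s "$f1" "$f2"; then
          printf '%s\t%s\n' "$f1" "$f2"
        fi
      done
  done
  rm -f "$sizes2"
}
```

Calling match_by_size folder1 folder2 lists the files that exist in both trees under possibly different names; anything not listed was changed, added, or removed and still needs a regular diff.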

    2. Git-Diff

      You mentioned git-diff in the tags. git does not actually store a record of renames; it detects them when it computes a diff, by pairing removed and added files whose contents are similar enough (50% by default). So if you intend to use git diff, you can rely to some degree on git's ability to detect renamings. This typically works if a file is removed and added under a new name within the range you compare and its content is mostly unchanged. When you look at commits one at a time (e.g. with git log), a file that gets added under a new name in one commit and whose older version is removed in another commit won't be reported as a rename. There is a lot more to learn about renames in git diff; see man git-diff and search for rename. There are about a dozen places where this gets mentioned, so I won't try to summarize them here myself.

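      EDIT: You can use a command like git diff --find-renames --diff-filter=ACDMTUX (i.e. you let all kinds of changes pass the filter with the exception of renamings). Note that this drops every entry detected as a rename, including renamed files whose content was also modified.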
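
Putting the points of the question together, a rough sketch could look like the one below. The function name count_added_java_lines is made up for illustration, and the sketch assumes it is run from the top of the repository so that the excluded folders a and b are resolved relative to the repository root. It uses --numstat instead of grep '^>' because git diff marks added lines with + rather than >, and summing the numstat column avoids counting the +++ header lines. It also leaves out --diff-filter on purpose: with --find-renames a pure rename contributes zero lines anyway, while a file that was renamed and edited still contributes its changed lines.

```shell
# count_added_java_lines OLD NEW (hypothetical helper):
# print the number of added lines in *.java files between two commits,
# with renames detected and the folders a/ and b/ left out.
count_added_java_lines() {
  git diff --find-renames --numstat "$1" "$2" -- \
      '*.java' ':(exclude)a' ':(exclude)b' |
    # numstat prints "added<TAB>deleted<TAB>path"; binary files show "-".
    awk '{ added += $1 } END { print added + 0 }'
}
```

For example, count_added_java_lines HEAD~1 HEAD prints the number of lines added to Java files by the latest commit. Summing the second column instead would count removed lines.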