
Compare two archive files in Bash


I'm new to Bash and I need help.

I need to create a shell script that shall compare two gzipped archives. For each file or directory in each archive file (even in archived subdirectories), the script shall verify whether a file/directory of the same name exists in the other archive. In case of a missing directory, ignore missing files or subdirectories within this directory. The script shall list the names of all files which do not have a matching equivalent in the other archive.

The expected output of the script when comparing archives arch1.tar.gz and arch2.tar.gz and finding the differing files aa/a.txt and bb/b.txt in arch1.tar.gz and c.txt in arch2.tar.gz:

arch1.tar.gz:aa/a.txt

arch1.tar.gz:bb/b.txt

arch2.tar.gz:c.txt

Here is what I have:

#!/bin/bash
$1
$2

tar tf $1>> list1.txt
tar tf $2>> list2.txt
comm -23 <(sort list1.txt -o list1.txt | uniq) <(sort list2.txt -o list2.txt| uniq)
diff list1.txt list2.txt>>contestboth

The thing is that I can't figure out how to produce the output.


Solution

  • Try this:

    diff <(sort -u list1.txt) <(sort -u list2.txt)
    

    This starts two subprocesses (the two sort commands) and connects their output to file descriptors. The syntax <(...) expands to a file name representing such a file descriptor (something like /dev/fd/63). So in the end, diff is called with two file names which, when read, appear to contain the output of the two processes.

    This method works fine for programs which read a file strictly linearly. Seeking in the "file" is not possible, of course.
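
Note that diff marks differing lines with < and >, while the question asks for each name to be prefixed with its archive. A minimal sketch of one way to get that format is shown below, using comm (which the question already tried) instead of diff. It makes several assumptions: the function name only_in is hypothetical, the archives are gzipped tar files that contain explicit directory entries (names ending in /), and file names contain no newlines or backslashes. The awk filter is one possible reading of the rule "in case of a missing directory, ignore missing files or subdirectories within this directory".

```shell
#!/bin/bash
# Sketch: list entries present in one archive but missing from the other.

# only_in A B (hypothetical helper): print the entries of archive A that
# are absent from archive B, each prefixed with "A:".
only_in() {
    # comm -23 keeps lines found only in the first input; both inputs
    # must be sorted with the same collation, hence LC_ALL=C throughout.
    LC_ALL=C comm -23 <(tar -tzf "$1" | LC_ALL=C sort -u) \
                      <(tar -tzf "$2" | LC_ALL=C sort -u) |
    awk -v name="$1" '
        # Skip entries below a directory already reported as missing.
        skip != "" && index($0, skip) == 1 { next }
        # Remember a missing directory (name ends in "/") as a prefix.
        { skip = ($0 ~ /\/$/) ? $0 : ""; print name ":" $0 }
    '
}

# Compare in both directions when exactly two archives are given.
if [ "$#" -eq 2 ]; then
    only_in "$1" "$2"
    only_in "$2" "$1"
fi
```

Saved under a name of your choice (for example compare.sh), it would be called as ./compare.sh arch1.tar.gz arch2.tar.gz. No temporary list files are needed, because tar writes its listing straight into the process substitutions.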