linux bash md5sum

Generate MD5sum for all files in a directory, and get their matches in another directory


I have two directories: livedir contains 2000 fmx files, and testdir contains 6000 fmb files with timestamps attached. I compiled every fmb in testdir to fmx so that I can match them against the fmx files in livedir.

I wrote the following script to compute the MD5 checksum of each file in livedir and check whether a file with the same checksum exists in testdir:

testdir='/home/oracle/ideatest/test/'
livedir='/home/oracle/ideatest/live/'
cd /home/oracle/ideatest/live/

for f in *; do
    livefile=$(md5sum "$f" | cut -d" " -f1)
    sourcefile=$(md5sum "$testdir""$f" | cut -d" " -f1)
    if [[ -f $f ]] && [ $livefile == $sourcefile ]; then
        echo "$f" "OK-----------------------------"
        echo "$sourcefilename"
        cp /home/oracle/bankplus/ideatest/test/$f  /home/oracle/bankplus/ideatest/live2/$f
        # the problem is the file name; I want it without the hash
    else
        echo "$f" "MODIFIED"
    fi
done

The script works only when a file with the same name exists in both directories, because the lookup in testdir reuses the same name $f:

sourcefile=$(md5sum "$testdir""$f" | cut -d" " -f1)

As a result, cp copies only one file, even though testdir contains several files with the same hash value.
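The limitation can be reproduced in isolation. The following is a minimal sketch, assuming GNU md5sum and awk are available; the file names (form.fmx, form_20200101.fmx) are hypothetical and only stand in for a live file and its timestamped counterpart:

```bash
#!/bin/bash
# Hypothetical demo: two files with identical content but different names.
tmp=$(mktemp -d)
mkdir "$tmp/live" "$tmp/test"
printf 'same bytes\n' > "$tmp/live/form.fmx"
printf 'same bytes\n' > "$tmp/test/form_20200101.fmx"

live_sum=$(md5sum "$tmp/live/form.fmx" | cut -d" " -f1)

# Name-based lookup: misses the match because test/form.fmx does not exist.
if [[ -f "$tmp/test/form.fmx" ]]; then
    name_lookup="found"
else
    name_lookup="missing"
fi

# Content-based lookup: print every file in test whose checksum equals live_sum.
# md5sum prints a 32-character hash plus a 2-character separator, so the
# file name starts at column 35 (this keeps names containing spaces intact).
content_matches=$(cd "$tmp/test" && md5sum -- * |
    awk -v h="$live_sum" '$1 == h { print substr($0, 35) }')

echo "name lookup: $name_lookup"          # prints "name lookup: missing"
echo "content matches: $content_matches"  # prints "content matches: form_20200101.fmx"
rm -rf "$tmp"
```

Comparing by checksum rather than by name is what lets the timestamped file be found at all.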


Solution

  • If your bash version is 4.0 or later, you can use an associative array:

    #!/bin/bash
    
    testdir="/home/oracle/ideatest/test"
    livedir="/home/oracle/ideatest/live"
    
    declare -A hash
    
    # 1st step: create a hash table of md5sum in $testdir
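    # read NUL-delimited names so paths containing whitespace are not split
    while IFS= read -r -d '' f; do
        md5sum=$(md5sum "$f" | cut -d" " -f1)
        # holds md5sum as a key and filename as a value
        # (the last file wins when several share a checksum)
        hash[$md5sum]=${f##*/}
    done < <(find "$testdir" -type f -print0)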
    
    # 2nd step: loop over files in $livedir and test if md5sum value of a file
    # exists in $testdir
    while IFS= read -r -d '' f; do
        basename=${f##*/}
        md5sum=$(md5sum "$f" | cut -d" " -f1)
        if [[ -n "${hash[$md5sum]}" ]]; then
            echo "$basename" "OK-----------------------------"
            echo "${hash[$md5sum]}"
            # copy the matching test file (its name may differ from $basename)
            cp "/home/oracle/bankplus/ideatest/test/${hash[$md5sum]}" "/home/oracle/bankplus/ideatest/live2/$basename"
        else
            echo "$basename" "MODIFIED"
        fi
    done < <(find "$livedir" -type f -print0)
    

    Hope this helps.
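
The associative array above stores a single filename per checksum, so when several files in testdir share a checksum only one of them is remembered. If every match should be copied, one possible variation is to append to the array entry instead of overwriting it. This is a sketch rather than a drop-in replacement: the function name copy_matches and its three arguments are hypothetical, it assumes bash 4 or later and file names without embedded newlines, and it keeps the testdir file names in the output directory so that matches do not overwrite each other.

```bash
#!/bin/bash
# Sketch: copy every file in a test directory whose content matches some
# file in a live directory. Names below are illustrative assumptions.

copy_matches() {
    local testdir=$1 livedir=$2 outdir=$3
    local f sum name
    local -A by_sum=()   # checksum -> newline-separated list of test paths

    # Index the test directory; append (+=) so duplicate checksums survive.
    while IFS= read -r -d '' f; do
        sum=$(md5sum < "$f" | cut -d" " -f1)
        by_sum[$sum]+="$f"$'\n'
    done < <(find "$testdir" -type f -print0)

    # Look up each live file by checksum and copy all matching test files.
    while IFS= read -r -d '' f; do
        sum=$(md5sum < "$f" | cut -d" " -f1)
        if [[ -n "${by_sum[$sum]}" ]]; then
            echo "${f##*/} OK"
            while IFS= read -r name; do
                if [[ -n "$name" ]]; then
                    cp -- "$name" "$outdir/"
                fi
            done <<< "${by_sum[$sum]}"
        else
            echo "${f##*/} MODIFIED"
        fi
    done < <(find "$livedir" -type f -print0)
}
```

It could be called as `copy_matches "$testdir" "$livedir" /path/to/live2`, where the output directory must already exist. Reading the file on standard input (`md5sum < "$f"`) keeps the file name out of md5sum's output, so unusual characters in a name cannot affect the checksum field.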