
Get the MD5 hash of all files in a directory and its subdirectories in Python


I am new to Python. How do I get the MD5 hash of every file in a directory, including its subdirectories? My current code can only hash the files directly inside one folder. What changes do I need to make? My code is below:

import hashlib
import glob
import os.path
filenames = glob.glob('C:/Users/User/Desktop/irustesting/*')
def md5(fname):
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(2 ** 20), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()

# Load the known-bad MD5 hashes, one per line.
with open('C:/FYP/SecuCOM2022/viruslist.txt', 'rt') as viruslist:
    virusinside = [l.rstrip() for l in viruslist]
virus = "detected"
novirus = "clear"

for filename in filenames:
    file_hash = md5(filename)  # hash each file once and reuse the result
    print(filename, file_hash)
    if file_hash in virusinside:
        print(virus)
        os.remove(filename)
    else:
        print(novirus)

Solution

  • You can collect the absolute paths of all files, including those in subdirectories, with the os.walk() function by defining the following helper:

    import os

    def get_all_abs_paths(rootdir):
        # Walk rootdir recursively and collect the absolute path of every file.
        paths = []
        for dirpath, _, filenames in os.walk(rootdir):
            for f in filenames:
                paths.append(os.path.abspath(os.path.join(dirpath, f)))
        return paths
    

    This way you can keep the rest of your code and only replace the glob call that assigns the filenames variable with filenames = get_all_abs_paths('C:/Users/User/Desktop/irustesting').
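
For reference, here is a minimal sketch that combines the recursive walk with the chunked MD5 function from the question. The names md5_of_file and md5_of_tree are made up for this illustration and are not part of any library. It assumes every file under the root is readable by the current user.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=2 ** 20):
    """Return the hex MD5 digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read chunk_size bytes at a time until f.read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_tree(rootdir):
    """Map the absolute path of each file under rootdir to its MD5 digest."""
    hashes = {}
    # os.walk visits rootdir and every subdirectory below it.
    for dirpath, _, filenames in os.walk(rootdir):
        for name in filenames:
            path = os.path.abspath(os.path.join(dirpath, name))
            hashes[path] = md5_of_file(path)
    return hashes
```

You could then loop over md5_of_tree('C:/Users/User/Desktop/irustesting').items() and compare each digest against the hash list. Returning a dictionary means each file is hashed only once, and os.walk only lists regular file names in filenames, so directories are never passed to open().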