javascript, node.js, algorithm, hash, checksum

How to make use of a file hash when downloading a file to a server using Node.js


I have a feature in my program that downloads thousands of files to the server from their URLs. I was told to hash each file before downloading it. Is that for an integrity check? Is a checksum a good solution? Can anyone tell me the main benefit of using a hashing algorithm when downloading files to a server? Would it lower my server costs, since in production every "write" has a cost? Can anyone please explain the real use and purpose of hashing a file before downloading it to the server? And how can I integrate file hashing into my code below? Help would be much appreciated, thank you.

My code for downloading files to the server:

final_list is an array of URLs, by the way.

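    var fs = require('fs');
    var request = require('request');

    // Stream the response body for `url` straight into the file `dest`.
    var download = function (url, dest, callback) {
        request.get(url)
            .on('error', function (err) { console.log(err); })
            .pipe(fs.createWriteStream(dest))
            .on('close', callback);
    };

    // Start one download per URL; the file name is the last path segment.
    final_list.forEach(function (str) {
        var filename = str.split('/').pop();

        console.log('Downloading ' + filename);

        download(str, filename, function () { console.log('Finished downloading ' + filename); });
    });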

Solution

  • You should ask the architect or management what the real purpose of the checksum/hash is in your product.

    Hashing serves two purposes in programs that download many files. First, you can store each unique file only once, keyed by its hash (possibly saving a few gigabytes of space). Second, you can check the hash against services like VirusTotal.

    Another use case for a hash/checksum is responding to copyright claims: you can find and remove all copies of the same data.

    Because you are downloading so many files, a hash and a checksum may not seem very different in this case. For details, see the answers to this question: Hash Code and Checksum - what's the difference?
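
As a rough illustration of the "store unique files by hash" idea, here is a minimal sketch that uses only Node's built-in crypto and fs modules. The helper names (sha256Hex, hashFile, isNewContent), the choice of SHA-256, and the in-memory Set are assumptions made for this example and are not part of the original code. Note that a hash is computed from the file's bytes, so unless the source publishes a digest you still have to fetch the content before you can hash it; the possible saving is in not keeping duplicate copies.

```javascript
var crypto = require('crypto');
var fs = require('fs');

// Hypothetical helper: hex SHA-256 digest of a string or Buffer already in memory.
function sha256Hex(data) {
    return crypto.createHash('sha256').update(data).digest('hex');
}

// Hypothetical helper: hash a file on disk chunk by chunk, so large files
// never have to be loaded into memory in one piece.
function hashFile(path, callback) {
    var hash = crypto.createHash('sha256');
    fs.createReadStream(path)
        .on('error', callback)
        .on('data', function (chunk) { hash.update(chunk); })
        .on('end', function () { callback(null, hash.digest('hex')); });
}

// Hypothetical in-memory index of digests seen so far. A real system would
// more likely keep this in a database so it survives restarts.
var seenDigests = new Set();

// Returns true the first time a digest is seen and false for every repeat.
function isNewContent(digest) {
    if (seenDigests.has(digest)) {
        return false;
    }
    seenDigests.add(digest);
    return true;
}
```

One way to use this with the download function from the question would be to call hashFile(filename, ...) in the 'close' callback and delete the file (or skip further processing) when isNewContent(digest) returns false. The same digest could also be looked up in a service such as VirusTotal, as the answer suggests.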