java multithreading arraylist concurrency

Calculating hash in parallel by multiple threads and adding the outputs in an ArrayList<String>


I've written the following code to calculate the SHA-256 hash of a String on several threads and insert all the outputs into an ArrayList<String>:

        ArrayList<Thread> threadList = new ArrayList<Thread>();
        ArrayList<String> threadListStr = new ArrayList<String>();
        int threadNumber = 100;
        for (int i = 0; i < threadNumber; i++) {
            String tId = String.valueOf(i);
            Thread thr = new Thread(() -> {
                threadListStr.add(calculateHash(tId));
            });
            threadList.add(thr);
        }

        // START the threads
        for (int i = 0; i < threadNumber; i++) {
            threadList.get(i).start();
        }
        // STOP the threads
        for (int i = 0; i < threadNumber; i++) {
            threadList.get(i).interrupt();
        }

        System.out.println("Size of ArrayList<String> is: " + threadListStr.size());
        System.out.println("Size of ArrayList<Thread> is: " + threadList.size());
        
        /////////////////////
        
        public static String calculateHash(String tId) {
            return org.apache.commons.codec.digest.DigestUtils.sha256Hex(tId);
        }

However, the ArrayList<String> never ends up complete. As you can see from the five runs below, it has a different size each time, even though the ArrayList<Thread> threadList is always complete (the number of threads is 100).

//1st run
Size of ArrayList<String> is: 60
Size of ArrayList<Thread> is: 100

//2nd run
Size of ArrayList<String> is: 30
Size of ArrayList<Thread> is: 100

//3rd run
Size of ArrayList<String> is: 10
Size of ArrayList<Thread> is: 100

//4th run
Size of ArrayList<String> is: 61
Size of ArrayList<Thread> is: 100

//5th run
Size of ArrayList<String> is: 69
Size of ArrayList<Thread> is: 100

How should the code be modified so that the ArrayList<String> stores all the outputs?

EDIT: I changed the code as follows, but the output is the same.

        ArrayList<Thread> threadList = new ArrayList<Thread>();
        //ArrayList<String> threadListStr = new ArrayList<String>();
        List<String> threadListStrSync = Collections.synchronizedList(new ArrayList<>());
        int threadNumber = 100;
        for (int i = 0; i < threadNumber; i++) {
            String tId = String.valueOf(i);
            Thread thr = new Thread(() -> {
                threadListStrSync.add(calculateHash(tId));
            });
            threadList.add(thr);
        }

        // START the threads
        for (int i = 0; i < threadNumber; i++) {
            threadList.get(i).start();
        }
        // STOP the threads
        for (int i = 0; i < threadNumber; i++) {
            threadList.get(i).interrupt();
        }

        System.out.println("Size of ArrayList<String> is: " + threadListStrSync.size());
        System.out.println("Size of ArrayList<Thread> is: " + threadList.size());

Note: I also commented out the interrupt() calls, but the output is still the same.


Solution

  • There are multiple problems:

    1. ArrayList is not thread-safe, so either use a thread-safe collection or manually synchronize access - one easy option is to wrap your list with Collections.synchronizedList()
    2. interrupt() is not needed; the threads terminate on their own when they reach the end of their run() method
    3. You need to wait for all threads to terminate before printing the results, otherwise the main thread reads the list while some threads are still running (which is why the synchronized version still prints a short list) - to do that, invoke join() on each thread instead of interrupt()
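
Putting the three points together, a minimal sketch could look like the following. It is not the only way to do this, just one way under a few assumptions: the class name ParallelHash and the helper hashAll are made up for illustration, and calculateHash uses the JDK's MessageDigest as a stand-in for DigestUtils.sha256Hex so that the example runs without commons-codec.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class ParallelHash {

    // Assumption: JDK-only replacement for DigestUtils.sha256Hex(tId).
    static String calculateHash(String tId) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(tId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Hypothetical helper: hashes "0".."threadNumber-1", one thread per value.
    static List<String> hashAll(int threadNumber) {
        List<Thread> threadList = new ArrayList<>();
        // Point 1: a synchronized wrapper makes concurrent add() calls safe.
        List<String> hashes = Collections.synchronizedList(new ArrayList<>());

        for (int i = 0; i < threadNumber; i++) {
            String tId = String.valueOf(i);
            threadList.add(new Thread(() -> hashes.add(calculateHash(tId))));
        }

        // START the threads
        for (Thread t : threadList) {
            t.start();
        }

        // Points 2 and 3: no interrupt(); WAIT for every thread with join().
        for (Thread t : threadList) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return hashes;
    }

    public static void main(String[] args) {
        List<String> hashes = hashAll(100);
        System.out.println("Size of hash list is: " + hashes.size());
    }
}
```

Because main only reads the list after every join() has returned, the reported size should be 100 on each run. The order of the hashes in the list is still not deterministic, since it depends on which thread finishes first.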