Search code examples
javamultithreadingjava.util.scannerbufferedreader

MultiThread runs slower than single process


for an assignment in school I was asked to create a simple program that creates 1000 text files, each with a random amount of lines, count how many lines are there via multi-thread\single process. than delete those files.

now a strange thing happens during testing - linear counting of all files is always a bit faster than counting them in a multi-threaded way which has sparked quite the academic theorizing session within my classroom circle.

when using Scanner to read all files, everything works as intended - 1000 files are read at around 500ms linear time and 400ms threaded time

yet when i use BufferedReader times drop to around 110ms linear and 130ms threaded.

which part of the code causes this bottleneck and why?

EDIT: Just to clarify, I'm not asking why does Scanner works slower than BufferedReader.

the full compile-able code: (though you should change the file creation path output)

import java.io.*;
import java.util.Random;
import java.util.Scanner;

/**
 * Builds text files with random amount of lines and counts them with 
 * one process or multi-threading.
 * @author Hazir
 */// CLASS MATALA_4A START:
public class Matala_4A {

    /* Finals: */
    private static final String MSG = "Hello World";

    /* Privates: */
    private static int count;
    private static Random rand;

    /* Private Methods: */ /**
     * Increases the random generator.
     * @return The new random value.
     */
    private static synchronized int getRand() {
        return rand.nextInt(1000);
    }

    /**
     * Increments the lines-read counter by a value.
     * @param val The amount to be incremented by.
     */
    private static synchronized void incrementCount(int val) {
        count+=val;
    }

    /**
     * Sets lines-read counter to 0 and Initializes random generator 
     * by the seed - 123.
     */
    private static void Initialize() {
        count=0;
        rand = new Random(123);
    }

    /* Public Methods: */ /**
     * Creates n files with random amount of lines.
     * @param n The amount of files to be created.
     * @return String array with all the file paths.
     */
    public static String[] createFiles(int n) {
        String[] array = new String[n];
        for (int i=0; i<n; i++) {
            array[i] = String.format("C:\\Files\\File_%d.txt", i+1);
            try (   // Try with Resources: 
                    FileWriter fw = new FileWriter(array[i]); 
                    PrintWriter pw = new PrintWriter(fw);
                    ) {
                int numLines = getRand();
                for (int j=0; j<numLines; j++) pw.println(MSG);
            } catch (IOException ex) {
                System.err.println(String.format("Failed Writing to file: %s", 
                        array[i]));
            }
        }
        return array;
    }

    /**
     * Deletes all the files who's file paths are specified 
     * in the fileNames array.
     * @param fileNames The files to be deleted.
     */
    public static void deleteFiles(String[] fileNames) {
        for (String fileName : fileNames) {
            File file = new File(fileName);
            if (file.exists()) {
                file.delete();
            }
        }
    }

    /**
     * Creates numFiles amount of files.<br>
     * Counts how many lines are in all the files via Multi-threading.<br>
     * Deletes all the files when finished.
     * @param numFiles The amount of files to be created.
     */
    public static void countLinesThread(int numFiles) {
        Initialize();
        /* Create Files */
        String[] fileNames = createFiles(numFiles);
        Thread[] running = new Thread[numFiles];
        int k=0;
        long start = System.currentTimeMillis();
        /* Start all threads */
        for (String fileName : fileNames) {
            LineCounter thread = new LineCounter(fileName);
            running[k++] = thread;
            thread.start();
        }
        /* Join all threads */
        for (Thread thread : running) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                // Shouldn't happen.
            }
        }
        long end = System.currentTimeMillis();
        System.out.println(String.format("threads time = %d ms, lines = %d",
                end-start,count));
        /* Delete all files */
        deleteFiles(fileNames);
    }

    @SuppressWarnings("CallToThreadRun")
    /**
     * Creates numFiles amount of files.<br>
     * Counts how many lines are in all the files in one process.<br>
     * Deletes all the files when finished.
     * @param numFiles The amount of files to be created. 
     */
    public static void countLinesOneProcess(int numFiles) {
        Initialize();
        /* Create Files */
        String[] fileNames = createFiles(numFiles);
        /* Iterate Files*/
        long start = System.currentTimeMillis();
        LineCounter thread;
        for (String fileName : fileNames) {
            thread = new LineCounter(fileName);
            thread.run(); // same process
        }
        long end = System.currentTimeMillis();
        System.out.println(String.format("linear time = %d ms, lines = %d",
                end-start,count));
        /* Delete all files */
        deleteFiles(fileNames);
    }

    public static void main(String[] args) {
        int num = 1000;
        countLinesThread(num);
        countLinesOneProcess(num);
    }

    /**
     * Auxiliary class designed to count the amount of lines in a text file.
     */// NESTED CLASS LINECOUNTER START:
    private static class LineCounter extends Thread {

        /* Privates: */
        private String fileName;

        /* Constructor: */
        private LineCounter(String fileName) {
            this.fileName=fileName;
        }

        /* Methods: */

        /**
         * Reads a file and counts the amount of lines it has.
         */ @Override
        public void run() {
            int count=0;
            try ( // Try with Resources:
                    FileReader fr = new FileReader(fileName);
                    //Scanner sc = new Scanner(fr);
                    BufferedReader br = new BufferedReader(fr);
                    ) {
                String str;
                for (str=br.readLine(); str!=null; str=br.readLine()) count++;
                //for (; sc.hasNext(); sc.nextLine()) count++;
                incrementCount(count);
            } catch (IOException e) {
                System.err.println(String.format("Failed Reading from file: %s", 
                fileName));            
            }
        }
    } // NESTED CLASS LINECOUNTER END;
} // CLASS MATALA_4A END;

Solution

  • There can be different factors:

    • Most important is avoiding disk access from multiple threads at the same time (but since you are on SSD, you might get away with that). On a normal harddisk however, switching from one file to another could cost you 10ms seek time (depending on how the data is cached).

    • 1000 threads is too much, try to use number of cores * 2. Too much time will be lost switching contexts only.

    • Try using a thread pool. Total times are between 110ms and 130ms, part of that will be from creating threads.

    • Do some more work in the test in general. Timing 110ms isn't always that accurate. Also depends on what other processes or threads are running at that time.

    • Try to switch the order of your tests to see if it makes a difference (caching could be an important factor)

      countLinesThread(num);
      countLinesOneProcess(num);
      

    Also, depending on the system, currentTimeMillis() might have a resolution of 10 to 15ms. So it isn't very accurate to time short runs.

    long start = System.currentTimeMillis();
    long end = System.currentTimeMillis();