java, performance, path-finding

How many threads should I use? (Java)


I am writing an app that searches for files with a particular extension. I use multithreading: for each directory (a small task) I create a thread that does the following work:

/**
 * Explore given directory.
 * @param dir - directory to explore.
 * @return snapshot of directory - FilesAndDirs object,
 * which encapsulates information about directory.
 */
public final FilesAndDirs exploreDirectory(final File dir) {
    final List<File> subDirectories = new ArrayList<File>();
    final List<File> files = new ArrayList<File>();
    if (dir.isDirectory()) {
        final File[] children = dir.listFiles();
        if (children != null) {
            for (File child : children) {
                if (child.isFile() && !child.isHidden()
                        && checkExtension(child)) {
                    files.add(child);
                } else if (child.isDirectory() && !child.isHidden()) {
                    subDirectories.add(child);
                }
            }
        }
    }
    return new FilesAndDirs(files, subDirectories);
}

This method takes a snapshot of the given directory and returns the result as a FilesAndDirs object, which holds a List of files and a List of subdirectories. In another method (getFiles()) I have two lists: files, which holds the files with the given extension and is the result of the search, and directories, which holds the subdirectories of every directory still to be passed to the explore method.
So each thread explores one directory: it puts files with the given extension into the list of results and puts the subdirectories into a list of subdirectories, which we then add to the directories list in getFiles(). I use a fixed thread pool, but the problem is how many threads I should use to get the best performance. I have read that if the task is not I/O intensive, the number of threads should equal the number of available cores, Runtime.getRuntime().availableProcessors(). Currently it takes 41 seconds to explore the C: and D: drives. But maybe I should use more threads, or some "magic" classes from java.util.concurrent. Here is the getFiles() method: getFiles() method


Solution

  • Reading from a hard drive is essentially sequential, so multithreading is not efficient here: extra threads mostly wait for the disk. Your method is limited by I/O operations on the hard drive, not by your CPU power.
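
One way to check this on your own machine is to time the same scan with different pool sizes. The following is only a rough sketch, not the asker's getFiles() method: the class name ScanBenchmark, the method countFiles, and the ".java" extension used in main are all made up for illustration. It counts matching files instead of collecting them, and uses a counter of pending tasks so that no worker thread ever blocks waiting for its subdirectories.

```java
import java.io.File;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical benchmark class; names are illustrative, not from the question.
public class ScanBenchmark {

    /** Counts non-hidden files under root whose names end with ext, using a fixed pool. */
    public static long countFiles(File root, String ext, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong matches = new AtomicLong();
        AtomicInteger pending = new AtomicInteger(1); // 1 = the root task
        CountDownLatch done = new CountDownLatch(1);
        try {
            pool.execute(() -> explore(root, ext, pool, matches, pending, done));
            done.await(); // released when the last directory task finishes
        } finally {
            pool.shutdown();
        }
        return matches.get();
    }

    // One task per directory: count matching files, queue subdirectories.
    private static void explore(File dir, String ext, ExecutorService pool,
                                AtomicLong matches, AtomicInteger pending,
                                CountDownLatch done) {
        try {
            File[] children = dir.listFiles();
            if (children == null) {
                return; // not a directory, or not readable
            }
            for (File child : children) {
                if (child.isHidden()) {
                    continue;
                }
                if (child.isFile() && child.getName().endsWith(ext)) {
                    matches.incrementAndGet();
                } else if (child.isDirectory()) {
                    pending.incrementAndGet(); // register before submitting
                    pool.execute(() -> explore(child, ext, pool, matches, pending, done));
                }
            }
        } finally {
            if (pending.decrementAndGet() == 0) {
                done.countDown(); // no tasks left anywhere
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        File root = new File(args.length > 0 ? args[0] : ".");
        int cores = Runtime.getRuntime().availableProcessors();
        for (int threads : new int[] {1, 2, cores, 2 * cores}) {
            long start = System.nanoTime();
            long found = countFiles(root, ".java", threads);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("%d thread(s): %d files in %d ms%n", threads, found, ms);
        }
    }
}
```

Keep in mind that the operating system caches directory data, so the first pass is usually the slowest regardless of thread count. Run the benchmark several times, or after a reboot, before drawing conclusions. Like the code in the question, this sketch follows symbolic links to directories, so a cyclic link would make it loop.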