Tags: java, multithreading, parallel-processing, wait

How to join all threads after all files are visited?


I'm creating a CLI Search Engine for my project and I need to traverse the path given by the user. I'm using FileVisitor. To make things faster, I implemented multi-threading.

Here is FileCrawler.java:

public class FileCrawler implements FileVisitor<Path> {

  private static final int NUM_THREADS = 5;
  private final List<String> supportedFileTypes = Arrays.asList("txt", "pdf", "doc", "docx", "ppt", "xls", "xlsx", "csv");

  private final ExecutorService executor;
  private final Indexer indexer;

  public FileCrawler() {
    this.executor = Executors.newFixedThreadPool(NUM_THREADS);
    this.indexer = new Indexer();
  }

  @Override
  public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
    return FileVisitResult.CONTINUE;
  }

  @Override
  public FileVisitResult visitFile(Path path, BasicFileAttributes attrs) throws IOException {
    String fileType = getFileType(path);
    if (fileType != null && supportedFileTypes.contains(fileType)) {
      executor.execute(new IndexTask(path.toFile(), fileType));
    }
    return FileVisitResult.CONTINUE;
  }

  @Override
  public FileVisitResult visitFileFailed(Path path, IOException exc) throws IOException {
    System.err.println("Failed to visit file: " + path);
    return FileVisitResult.CONTINUE;
  }

  @Override
  public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
    return FileVisitResult.CONTINUE;
  }

  private String getFileType(Path path) {
    String fileName = path.getFileName().toString();
    int dotIndex = fileName.lastIndexOf('.');
    return (dotIndex == -1) ? "" : fileName.substring(dotIndex + 1);
  }

  private class IndexTask implements Runnable {
    private final File file;
    private final String fileType;

    public IndexTask(File file, String fileType) {
      this.file = file;
      this.fileType = fileType;
    }

    @Override
    public void run() {
      indexer.index(file, fileType);
    }
  }
}

Main:

Thread fileCrawlerThread = new Thread(() -> {
  try {
    Files.walkFileTree(searchDirectoryPath, fileCrawler);
  } catch (IOException e) {
    System.err.println("Error while walking the file tree: " + e.getMessage());
  }
});

fileCrawlerThread.start();

try {
  fileCrawlerThread.join();
  System.out.println("Index file has been created successfully!");
} catch (InterruptedException e) {
  e.printStackTrace();
}

while (true) {
  System.out.println("Please enter the query you want to search (Press q to quit): \r");
  String searchQuery = scanner.next();
  if (searchQuery.equals("q")) break;      
  // Some code here
}

When all files are visited, I want my main method to enter the while loop, but it doesn't. How do I join all threads?

I tried CountDownLatch, but I couldn't fix my issue.


Solution

  • As matt pointed out, I didn't need fileCrawlerThread.start() and fileCrawlerThread.join(). So, I removed them. I changed the executor in FileCrawler.java to public:

    public final ExecutorService executor;
    

    Then I changed my main method to:

    try {
      Files.walkFileTree(searchDirectoryPath, fileCrawler);
      fileCrawler.executor.shutdown();
      fileCrawler.executor.awaitTermination(10, TimeUnit.MINUTES);
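    } catch (IOException e) {
      System.err.println("Error while walking the file tree: " + e.getMessage());
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can see that the wait was interrupted.
      Thread.currentThread().interrupt();
    }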
    

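    Now shutdown() stops the executor from accepting new tasks while the tasks that were already submitted keep running, and awaitTermination() blocks the main thread until they have all finished or the 10-minute timeout elapses. After that, the program continues on the main thread alone and reaches the while loop as expected.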
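
    For anyone who would rather not make the executor public, here is a minimal sketch of the same shutdown-then-wait idea with the pool kept private. PoolShutdownSketch, submit, awaitCompletion and runAndWait are hypothetical names I made up for illustration (they are not part of FileCrawler), and the counter only stands in for the real indexing work. It also checks the boolean returned by awaitTermination, which is false when the timeout elapses before the tasks finish.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for FileCrawler: it owns the pool and keeps it private.
class PoolShutdownSketch {
    private final ExecutorService executor = Executors.newFixedThreadPool(5);

    // Stands in for what visitFile() does: queue one unit of work.
    void submit(Runnable task) {
        executor.execute(task);
    }

    // Stop accepting new tasks, then block until the queued ones finish
    // or the timeout elapses. Returns true only if everything finished.
    boolean awaitCompletion(long timeout, TimeUnit unit) throws InterruptedException {
        executor.shutdown();
        boolean finished = executor.awaitTermination(timeout, unit);
        if (!finished) {
            executor.shutdownNow(); // timed out: try to cancel whatever is left
        }
        return finished;
    }

    // Submits taskCount dummy tasks, waits for them, and returns how many ran.
    static int runAndWait(int taskCount) throws InterruptedException {
        PoolShutdownSketch crawler = new PoolShutdownSketch();
        AtomicInteger indexed = new AtomicInteger(); // placeholder for real indexing
        for (int i = 0; i < taskCount; i++) {
            crawler.submit(() -> indexed.incrementAndGet());
        }
        if (!crawler.awaitCompletion(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("Tasks did not finish before the timeout");
        }
        return indexed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Tasks completed: " + runAndWait(20));
    }
}
```

    In the real code, the main method would call something like awaitCompletion right after Files.walkFileTree returns, since by then every file has been submitted to the pool.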