I am trying to write a multithreaded Java program that multiplies two matrices read from a file, using a fixed upper limit on the number of threads.
For example, if I set the number of threads to 16, I want my thread pool to reuse those 16 threads until all the tasks are done.
However, the execution time gets longer as I increase the number of threads, and I am having a hard time understanding why.
Runnable:
class Task implements Runnable
{
int _row = 0;
int _col = 0;
public Task(int row, int col)
{
_row = row;
_col = col;
}
@Override
public void run()
{
Application.multMatrix(_row, _col);
}
}
Application:
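import java.util.Scanner;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class Application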
{
private static Scanner sc = new Scanner(System.in);
private static int _A[][];
private static int _B[][];
private static int _C[][];
public static void main(final String [] args) throws InterruptedException
{
ExecutorService executor = Executors.newFixedThreadPool(16);
ThreadPoolExecutor pool = (ThreadPoolExecutor) executor;
_A = readMatrix();
_B = readMatrix();
_C = new int[_A.length][_B[0].length];
long startTime = System.currentTimeMillis();
for (int x = 0; x < _C.length; x++)
{
for (int y = 0; y < _C[0].length; y++)
{
executor.execute(new Task(x, y));
}
}
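executor.shutdown();
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.HOURS);
// Stop the clock only after every task has finished; taking it before
// awaitTermination would measure little more than task submission.
long endTime = System.currentTimeMillis();
System.out.printf("Calculation Time: %d ms\n" , endTime - startTime);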
}
public static void multMatrix(int row, int col)
{
int sum = 0;
for (int i = 0; i < _B.length; i++)
{
sum += _A[row][i] * _B[i][col];
}
_C[row][col] = sum;
}
...
}
The matrix calculation and the workload sharing seem correct, so the problem might come from how I am using the thread pool.
Context switching takes time. If you have 8 cores and run 8 threads, they can all work simultaneously, and as soon as one finishes a task it is reused for the next one. If you have 16 threads on 8 cores, on the other hand, the threads compete for processor time, the scheduler has to keep switching between them, and your total time grows to execution time plus context-switching time.
The more threads you run beyond the number of cores, the more context switching there is, and hence the longer the run takes.
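As a rough sketch of that advice, the pool can be sized from `Runtime.getRuntime().availableProcessors()` instead of a hard-coded 16. The class name `PoolSizingSketch` and its `multiply` helper are hypothetical and not part of the question's code. The sketch also submits one task per row rather than one per cell, which is my own assumption to keep per-task overhead small, not something the question does.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: fixed pool sized to the core count, one task per row.
public class PoolSizingSketch {

    // Multiplies a (n x k) by b (k x m) using a fixed pool of `threads` workers.
    static int[][] multiply(int[][] a, int[][] b, int threads) {
        int rows = a.length, cols = b[0].length, inner = b.length;
        int[][] c = new int[rows][cols];
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int r = 0; r < rows; r++) {
            final int row = r;
            // Assumption: one task per row, so each task does enough work
            // to outweigh the cost of queuing and scheduling it.
            executor.execute(() -> {
                for (int col = 0; col < cols; col++) {
                    int sum = 0;
                    for (int i = 0; i < inner; i++) {
                        sum += a[row][i] * b[i][col];
                    }
                    c[row][col] = sum;
                }
            });
        }
        executor.shutdown();
        try {
            // Wait for all rows before returning the result.
            executor.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        }
        return c;
    }

    public static void main(String[] args) {
        // Use as many worker threads as the JVM reports available processors.
        int cores = Runtime.getRuntime().availableProcessors();
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};

        long start = System.nanoTime();
        int[][] c = multiply(a, b, cores);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println(Arrays.deepToString(c)); // prints [[19, 22], [43, 50]]
        System.out.printf("Calculation Time: %d ms (threads: %d)%n", elapsedMs, cores);
    }
}
```

To see the effect described above, you could call `multiply` on your real matrices with `cores`, `2 * cores` and `4 * cores` threads and compare the timings. The numbers will depend on your machine and matrix sizes.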