Clang OpenMP. Find max value in matrix N x N

I need to find max value in matrix using OpenMP. It is my first experience with OpenMP, previously I did this task using pthreads. I wrote this code but it does not work:

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>


void MatrixFIller(int nrows, int* m) {
    for (int i = 0; i < nrows; i++) {
        for (int j = 0; j < nrows; j++) {

            *(m + i * nrows + j) = rand() % 200;

        }
    }
};

#define dimension 9
#define number_of_threads 4

int main() {
    srand(time(NULL));
    int matrix[dimension][dimension];
    int local_max=-1;
    int final_max=-1;
    int j = 0;

    MatrixFIller(dimension, &matrix[0][0]);


    for (int i = 0; i < dimension; i++) {
        for (int j = 0; j < dimension; j++) {


            printf("%d\t", matrix[i][j]);
        }
        printf("\n");
    }

    omp_set_num_threads(number_of_threads);


#pragma omp parallel private(local_max)
    {
#pragma omp for 
        for (j = 0; j < dimension * dimension; j++) {
            if (*(matrix + (int)((j) / dimension) * dimension + (j - dimension * ((int)(j / dimension)))) > local_max) {

                local_max = *(matrix + (int)((j) / dimension) * dimension + (j - dimension * ((int)((j) / dimension))));

            }
         }


#pragma omp critical
        
        if (local_max > final_max) { final_max = local_max; };

    };

   

    printf("Max value of matrix with dimension %d is %d", dimension, final_max);

    };

The idea is that in pragma for each thread finds local max and after that it is compared with global max value in pragma critical. Why it does not correct? Thanks!

Solution

When entering the parallel region, local_max gets unitialized: the private clause creates variables that are local to each thread and that's it, they are not initialized to any value. If you want them to be initialized with the content of local_max had before entering the parallel region, you have to use the firstprivate clause instead.

However, it would actually be better to declare (and initialize) local_max inside the parallel region.

Also, you may have a look at the reduction clause (with the max option), which will make the code even simpler:

#pragma omp parallel for reduction(max:final_max)
    for (j = 0; j < dimension * dimension; j++) {
        if (*(matrix + (int)((j) / dimension) * dimension + (j - dimension * ((int)(j / dimension)))) > final_max) {
            final_max = *(matrix + (int)((j) / dimension) * dimension + (j - dimension * ((int)((j) / dimension))));
         }
    }

EDIT

Following Laci's comment about the incorrectness of the arithmetic: all of your indeces calculations look correct but are not easy to read. Since you have from the begining a 2D array it is simpler to set two loops. And possibly tell OpenMP to parallelize them both using the collapse clause (and by the way, and as far as possible, declare the loop indeces within the for(): this avoids always wondering which ones should be declared as private or not):

#pragma omp parallel for reduction(max:final_max) collapse(2)
    for (int i = 0; i < dimension; i++) {
        for (int j = 0; j < dimension; j++) {
            if (matrix[i][j] > final_max) {
                final_max = matrix[i][j];
            }
        }
    }