I have a program that reads a large number of files (hundreds of files, about 120 MB each) generated by another MPI program, which can take some time. Each file contains the variables for its corresponding subdomain. I want to read the variables from each file and store them in a specific slice of a 4-dimensional array. Since this takes a considerable amount of time, I would like to parallelize this piece of code with OpenMP:
    SUBROUTINE read_old_restart
      INTEGER :: ii
      INTEGER :: thread_ID
      INTEGER :: OMP_GET_THREAD_NUM
      CHARACTER(LEN=21) :: file_name

    !$OMP PARALLEL DO PRIVATE(ii,file_name)
      DO ii=0,Nproc_old-1
        IF(ii < 10) THEN
          WRITE(file_name,401) "input/Restart_00", ii, ".out"
        ELSE IF(ii < 100) THEN
          WRITE(file_name,402) "input/Restart_0" , ii, ".out"
        ELSE
          WRITE(file_name,403) "input/Restart_"  , ii, ".out"
        END IF
        PRINT*, "Thread = ", OMP_GET_THREAD_NUM(), "Reading ", file_name
    401 format(a16,I1,a4)
    402 format(a15,I2,a4)
    403 format(a14,I3,a4)
        OPEN (unit=321, file=TRIM(file_name), status="old", form="unformatted")
        READ(321) t                              , &
                  old_u         (:,:,:,ii)       , &
                  old_v         (:,:,:,ii)       , &
                  old_w         (:,:,:,ii)       , &
                  old_p         (:,:,:,ii)       , &
                  old_uc        (:,:,:,ii)       , &
                  old_vc        (:,:,:,ii)       , &
                  old_wc        (:,:,:,ii)       , &
                  old_un2       (:,:,:,ii)       , &
                  old_vn2       (:,:,:,ii)       , &
                  old_wn2       (:,:,:,ii)       , &
                  old_un1       (:,:,:,ii)       , &
                  old_vn1       (:,:,:,ii)       , &
                  old_wn1       (:,:,:,ii)       , &
                  old_p1        (:,:,:,ii)       , &
                  old_viscu     (:,:,:,ii)       , &
                  old_viscv     (:,:,:,ii)       , &
                  old_viscw     (:,:,:,ii)       , &
                  old_convu     (:,:,:,ii)       , &
                  old_convv     (:,:,:,ii)       , &
                  old_convw     (:,:,:,ii)       , &
                  statindex                      , &
                  old_umn       (:,:,:,ii)       , &
                  old_uumn      (:,:,:,ii)       , &
                  old_urms      (:,:,:,ii)       , &
                  old_mass_frac (:,:,:,:,:,ii)   , &
                  old_enthT     (:,:,:,:,ii)
        CLOSE (321)
      END DO
    !$OMP END PARALLEL DO
    END SUBROUTINE read_old_restart
The code compiles and runs fine for the first loop iteration of each thread. Here is the output:
Thread = 3 Reading input/Restart_030.out
Thread = 7 Reading input/Restart_067.out
Thread = 2 Reading input/Restart_020.out
Thread = 6 Reading input/Restart_058.out
Thread = 9 Reading input/Restart_085.out
Thread = 8 Reading input/Restart_076.out
Thread = 5 Reading input/Restart_049.out
Thread = 4 Reading input/Restart_040.out
Thread = 11 Reading input/Restart_103.out
Thread = 0 Reading input/Restart_000.out
Thread = 1 Reading input/Restart_010.out
Thread = 10 Reading input/Restart_094.out
The code appears to be running, but it gets stuck after the above output. When running top, I cannot see any CPU usage. Any idea why it is not working as expected?
You should use a private integer variable for the unit number and set it to a different value for each thread. Using the same unit number for different files from different threads at the same time is a recipe for trouble. I am quite surprised it does not crash.
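A minimal sketch of that suggestion, reduced to a self-contained program so it can be compiled on its own (for example with gfortran -fopenmp). The array size, the file count, the Restart_NNN.out names in the current directory, and the names unit_no and t_local are all assumptions made for illustration, not taken from the original code. The sketch also reads the scalar into a private temporary, since in the question t and statindex appear to be shared and would otherwise be written by every thread at once.

```fortran
program read_restart_demo
  use omp_lib, only: omp_get_thread_num
  implicit none
  integer, parameter :: nfiles = 8, n = 4   ! hypothetical sizes, for illustration
  real    :: old_u(n, 0:nfiles-1)           ! stand-in for the old_* arrays
  real    :: buf(n)
  real    :: t, t_local
  integer :: ii, k, unit_no
  character(len=32) :: file_name

  ! Serially write small stand-in restart files so the example is runnable.
  do ii = 0, nfiles-1
    write(file_name, '(a,i3.3,a)') 'Restart_', ii, '.out'
    buf = real(ii)
    open(newunit=unit_no, file=trim(file_name), status='replace', &
         form='unformatted')
    write(unit_no) real(ii), buf
    close(unit_no)
  end do

  ! Each thread gets its own unit number and its own scalar temporary.
  !$omp parallel do private(ii, file_name, unit_no, t_local)
  do ii = 0, nfiles-1
    ! i3.3 zero-pads to three digits, replacing the three-branch IF.
    write(file_name, '(a,i3.3,a)') 'Restart_', ii, '.out'
    unit_no = 321 + omp_get_thread_num()    ! distinct unit per thread
    open(unit=unit_no, file=trim(file_name), status='old', &
         form='unformatted')
    read(unit_no) t_local, old_u(:, ii)     ! each iteration fills its own slice
    close(unit_no)
    if (ii == 0) t = t_local                ! only one iteration writes the shared t
  end do
  !$omp end parallel do

  ! Slice ii should hold the value ii that was written to file ii.
  print *, 't = ', t
  print *, 'slices match: ', all(old_u(1, :) == [(real(k), k = 0, nfiles-1)])
end program read_restart_demo
```

An alternative to computing the unit from the thread number is OPEN(newunit=unit_no, ...) from Fortran 2008, which lets the runtime pick an unused unit; unit_no still has to be private. Whether reading in parallel is actually faster depends on the file system, since the threads may simply end up competing for the same disk.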