I have an integer variable (implicitly typed as INTEGER*4) whose value changes unexpectedly from 51 to -1074038743 at one point inside a nested loop structure. I have verified that it is assigned in only one place, the one where I intend it to be.
I have placed WRITE statements in several strategic places, but it is still unclear to me why the value changes. The change happens (within one iteration of the innermost loop) between one section of code and the next, where the variable in question is only used as the bound of a DO loop. I have verified that the program misbehaves in the same way on consecutive compilations and runs.
[EDIT] This is just some pseudocode to outline the overall program structure:

          DO 13 I = 1, NUM1
             DO 40 J = 1, NUM2
                NV = 51
                .... some code
                WRITE(*,*) NV
                .... some code
                WRITE(*,*) NV
                .... some code
                WRITE(*,*) NV
    40       CONTINUE
    13    CONTINUE

It is on the third WRITE statement that the value has been corrupted. This happens about 16 minutes into the run of the program (about 10% of the way through). Here is the relevant code between the second and third WRITE statements above.

          DO 167 I8 = JNPR, INPR
          WRITE(*,*)"in i8 loop,i8, NV= ", i8, NV
          R12S = 0
          ROVS = 0
          STORE(M0) = E(I8)
          M0 = M0 + 1
          numb = inpr - jnpr + 1
          WRITE(*,'(8/,A,i4,A,i4,A,8/)')'..on ',jvcnt,'of',numb,'cases..'
          jvcnt = jvcnt + 1
          WRITE(*,*)"HERE 21",NLS(I8,1),NLS(I8,2),I8
          EB=E(NLS(I8,1))+E(i8)
          WRITE(*,*)"HERE 22"
          EA=E(NLS(I8,2))+E(i8)
          E1=QABS(EB-EDALL)
          E2=QABS(EA-EDALL)
          W(1)=1/E1
          W(2)=1/E2
          W12=W(1)+W(2)
          WP(1)=W(1)/W12
          WP(2)=W(2)/W12
          WP22=WP(2)**2
          WP12=WP(1)**2
          ED=QABS((WP(1)*E(NLS(I8,1))+WP(2)*E(NLS(I8,2)))-E(I8))
          G1=NLS(i8,1)
          G2=NLS(i8,2)
          WRITE(17,988) "i8",i8,"EB",EB,"EA",EA,"E1",E1,"E2",E2,"W(1)",W(1)
         1 ,"W(2)",W(2),"w12",w12,"WP(1)",WP(1),"E(NLS(I8,1))",G1,
         1 "E(NLS(i8,2))",G2,"ED",ED
    988   format(1x,a,i3,11(A,D26.19,/))
          DO 169 I9 = 1,2
          WRITE(*,*)"in i9 loop,i9, NV= ", i9, NV

I'm not sure how to produce a reproducible example (when I tried, the problem did not occur). It seems to happen only in the context of my full code (which is about 6000 lines long), but I have verified that the integer variable in question is never re-assigned, so I was hoping someone could shed some general light on what could cause this.
I was alerted to the problem by a segmentation fault, because NV is used as an index, but my own troubleshooting revealed that the corrupted value of NV itself was the cause, as described above.
The code fragment is probably insufficient to find the error.
In cases like this, I compile the program with:

    gfortran -O0 -g -Wall -pedantic -fcheck=all

and change the code until all compiler warnings and run-time errors have disappeared.
If the program still misbehaves, install valgrind and run the program under it:

    valgrind ./a.out

Then analyze the output of valgrind. This has helped me a lot.
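
As a rough illustration of why these run-time checks help: a variable that changes without ever being assigned is often the victim of an out-of-bounds write to some array, although the fragment in the question does not prove that this is what happens here. The sketch below is not taken from the question's program. The names `store`, `m0` and `nv` and the array size 5 are hypothetical, and the explicit `if` test is a hand-written stand-in for the index test that `-fcheck=all` (which includes bounds checking) adds to array accesses.

```fortran
      program bounds_demo
      implicit none
! Hypothetical names and sizes, chosen only for illustration.
      integer, parameter :: n = 5
      integer :: store(n)
      integer :: nv, m0, i

      nv = 51
      m0 = 1
      do i = 1, 8
! Hand-written stand-in for the bounds check that the
! compiler inserts when -fcheck=all is given.
         if (m0 .lt. 1 .or. m0 .gt. n) then
            write(*,'(A,I0,A,I0)') 'index ', m0, ' outside 1..', n
            exit
         end if
         store(m0) = i
         m0 = m0 + 1
      end do
! Without the guard, store(6) onwards would be written to
! memory that may belong to another variable, such as nv.
      write(*,'(A,I0)') 'nv = ', nv
      end program bounds_demo
```

With the guard in place the loop stops at the first bad index and `nv` keeps its value of 51. If the guard is removed and the program is built without checking, the behaviour is undefined and a neighbouring variable may be silently overwritten, which is the kind of symptom that the compiler checks and valgrind are meant to expose.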