Find the minimum element of the lower triangle

I use C++ as an additional language, but the implementation is required in Assembler. Given an 8x8 matrix, it is necessary to find the minimum number in the lower triangle of this matrix.

This is what I tried, but there is no result, an error occurs and I am generally not sure that the implementation is correct.

int main()
{

    int matrix[8][8] = {
        {5, 3, 7, 1, 9, 2, 8, 4},
        {2, 6, 3, 1, 8, 5, 7, 3},
        {8, 2, 5, 4, 1, 9, 6, 3},
        {4, 6, 8, 2, 5, 7, 9, 1},
        {7, 3, 2, 5, 8, 1, 4, 6},
        {9, 1, 4, 6, 2, 3, 7, 8},
        {3, 5, 9, 7, 4, 8, 2, 1},
        {6, 8, 1, 3, 7, 6, 5, 4}
    };

    int min_element;

 
    __asm {
        mov ecx, 8
        lea esi, matrix

        mov eax, [esi]
        mov min_element, eax

        outer_loop :
            add esi, 4

            inner_loop :
                mov eax, [esi]
                cmp eax, min_element
                jl update_min

                add esi, 4
                loop inner_loop

            jmp next_row

            update_min :
                mov min_element, eax

            next_row :
                sub esi, ecx
                add esi, 32
                loop outer_loop
    }

    std:cout << "Min element is: " << min_element << endl;

    return 0;
}

The error that occurs.

D:\Практичні\Lab4ASM\Debug\Lab4ASM.exe (process 12888) exited with code -1073741819.
Press any key to close this window . . .

This is what is meant by the lower triangle.

Solution

Problems

I am generally not sure that the implementation is correct.

Once you initialized the min_element, that very first add esi, 4 instruction already leaves the area of the lower triangle (so you should have stopped there for the first row of the matrix). You then immediately and erroneously process the rest of the first row plus the first element of the second row. At the conclusion of the inner_loop, the ECX register will hold zero, and so the sub esi, ecx take-back instruction won't take back anything. The add esi, 32 that follows will just skip 8 matrix elements. Hereafter, a loop outer_loop instruction confronted with an empty ECX register will start an endless loop.
In case the jl update_min is taken, things run somewhat differently, but still awfully wrong. You should not skip to the next row, since the remainder of the current row could have an even smaller element. And the take-back instruction sub esi, ecx in order to be of any use should be subtracting 4 times ECX since your matrix elements each have 4 bytes.

Solution

As an example, consider next 5x5 matrix

A, b, c, d, e
F, G, h, i, j
K, L, M, n, o
P, Q, R, S, t
U, V, W, X, Y

The elements in the lower triangle have been capitalized.
In memory you see the following:

process 1      process 2      process 3      process 4      process 5
<>             <--->          <------>       <--------->    <------------>
A, b, c, d, e, F, G, h, i, j, K, L, M, n, o, P, Q, R, S, t, U, V, W, X, Y
   <--------->       <------>          <--->             <>
   skip 4            skip 3            skip 2            skip 1            skip 0

For each row, the number of elements to process plus the number of elements to skip totals 5 (the dimension).

This is how you travers the matrix. We use the decreasing count of elements to skip to control the outer loop, and from it, we calculate the count for the inner loop because we have seen that process + skip equals dimension.

    lea  edi, matrix
    mov  esi, 5
    lea  edx, [esi - 1] ; Initial number of elements to skip (5 - 1)
OuterLoop:
    mov  ecx, esi       ; ESI is (fixed) dimension {5}
    sub  ecx, edx       ; EDX is number of elements to skip {4,3,2,1,0}
    ; Now ECX holds the number of elements in current row
    ; that belong to the lower triangle
InnerLoop:
    mov  ebx, [edi]
    add  edi, 4

    ...                 ; Here you check for the minimum

    dec  ecx
    jnz  InnerLoop
    lea  edi, [edi + edx * 4] ; Skip rest of this row
    sub  edx, 1               ; 4 -> 3, 3 -> 2, 2 -> 1, 1 -> 0, 0 -> -1
    jnb  OuterLoop            ; The 5th time we have a borrow, so we exit

This is how you find the minimum. Add these lines to the above code:

    mov  eax, [edi]           ; Initialize min_element

    ...

    cmp  ebx, eax             ; Current min_element
    jnl  NotBetter
    mov  eax, ebx             ; Better min_element
NotBetter:

Did you see how much nicer the code looks once you use the opposite conditional JNL NotBetter versus jl update_min?
If required, you could transfer the EAX contents to an actual memory-based variable like min_element.