Householder QR-factorization using Lapack and Fortran

I have a problem with QR-factorization using dgeqrf with my fortran code. I need to create Q and R matrices, so that A = Q * R and then compute err=||A - Q*R|| / ||A||, where || || is a Frobenius norm (same to Euclidian for two-dimensional arrays). I've read about dgeqrf here https://www.netlib.org/lapack/explore-html/dd/d9a/group__double_g_ecomputational_ga3766ea903391b5cf9008132f7440ec7b.html. The mistake is definitely somewhere in my usage of the function, because I have a similar code with my own QR factrorization, where everything works. Please, help me.I think that there is a mistake in the program because err should be somewhere around 10^(-15) (here it is somewhere around 10 ^(0)), that is because A should be equal to Q * R with a small error.

program terra
    implicit none
    integer(4) :: n, diag = 1, h, wdth, u
    real(8), dimension(:, :), allocatable :: A, Q, R, M_aux
    real(8), dimension(:), allocatable :: arr, aux
    real(8) :: err
    n = 1024
    allocate (A(n,n), Q(n,n), R(n, n), M_aux(n, n), arr(n), aux(n))

    !filling matrices
    do h = 1, n
        do wdth = 1, n
            A(h, wdth) = 1 / real(h + wdth, 8)
            R(h,wdth) = A(h,wdth)
            Q(h, wdth) = 0
        end do
        Q(diag, diag) = 1
        diag = diag + 1
    end do

    !doing the factrorisation
    call dgeqrf(n, n, R, n , arr, aux, n, u)

    !getting Q matrix
    do h = 2, n
        diag = h - 1
        do wdth = 1, diag
            Q(h, wdth) = R(h, wdth)
            R(h, wdth) = 0
            !write(*, *) h, wdth, R(h,wdth), Q(h, wdth)
        end do
    end do

    !Transposing it
    do wdth = 1, n
        do h = 1, n
            M_aux(wdth, h) = Q(wdth, h)
        end do
    end do
    do wdth = 1, n
        do h = 1, n
            Q(wdth, h) = M_aux(h, wdth)
        end do
    end do

    !computing err
    do h = 1, n
        do wdth = 1, n
            M_aux(h, wdth) = 0
            do diag = 1, n
                M_aux(h, wdth) = M_aux(h, wdth) + Q(h, diag) * R(diag, wdth)
            end do
        end do
    end do
    M_aux = M_aux - A
    err = norm2(M_aux) / norm2(A)
    write(*,*) err
end program

Solution

Here's the answer I wanted to give last night when I wasn't feeling up to much typing. It's great you've worked out your immediate problem (needing to use DORG2R) but I also wanted to point out a few other issues

Real( 8 ) and Integer( 4 ) are not portable and do not work with all compilers. The code below shows a portable way
Your loops were the wrong way around for a column major language like Fortran. As ordered in the code below the memory access patter is better, and the loops will be faster
Nobody should write a matrix multiply using loops nowadays. Simpler and better is to use the Fortran intrinsic Matmul, though if you really want performance BLAS is still probably your best bet. I show both below.

Anyway here is what I have

ian@eris:~/work/stack$ cat qr.f90
Program terra

  Use, Intrinsic :: iso_fortran_env, Only :  wp => real64
  
  Implicit None

  Integer :: n, h, wdth, u
  Real(wp), Dimension(:, :), Allocatable :: A, Q, R, M_aux
  Real(wp), Dimension(:), Allocatable :: arr, aux, work
  Real(wp) :: err
  ! Smaller for quick testing
  n = 1024
!!$  n = 200
  Allocate (A(n,n), Q(n,n), R(n, n), M_aux(n, n), arr(n), aux(n), work(n))

  !filling matrices
  ! Simple array syntax to fill Q
  Q = 0.0_wp
  ! Order loops for best memory access order in a
  ! column major language like Fortran
  Do wdth = 1, n
     Do h = 1, n
        A(h, wdth) = 1.0_wp / Real(h + wdth, Kind( A ) )
        R(h,wdth) = A(h,wdth)
     End Do
  End Do

  !doing the factrorization
  Call dgeqrf(n, n, R, n , arr, aux, n, u)
  Q = R

  !getting R matrix
  ! Order loops for best memory access order in a
  ! column major language like Fortran
  Do wdth = 1, n
     Do h = wdth + 1, n
        R(h, wdth) = 0.0_wp
     End Do
  End Do

  !getting Q matrix
  Call DORG2R(n, n, n, Q, n, arr, aux, u)

  ! Find error using Matmul - probably quicker than simple loops
  M_aux = Matmul( Q, R )
  M_aux = M_aux - A
  err = Norm2(M_aux) / Norm2(A)
  Write(*,*) 'Error, matmul ', err

  ! Find error using BLAS - almost certainly quicker than simple loops
  M_aux = Matmul( Q, R )
  Call dgemm( 'N', 'N', n, n, n, 1.0_wp, Q    , Size( Q    , Dim = 1 ), &
                                         R    , Size( R    , Dim = 1 ), &
                                 0.0_wp, M_Aux, Size( M_aux, Dim = 1 ) )
  M_aux = M_aux - A
  err = Norm2(M_aux) / Norm2(A)
  Write(*,*) 'Error, Blas ', err

End Program terra
  
ian@eris:~/work/stack$ gfortran --version
GNU Fortran (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ian@eris:~/work/stack$ gfortran -Wall -Wextra -fcheck=all -g -std=f2008 -Wall qr.f90  -llapack -lblas
ian@eris:~/work/stack$ ./a.out
 Error, matmul    8.2474910291356388E-016
 Error, Blas    8.9420732364900517E-016