Search code examples
fortransamplingweighted

Weighted sampling in Fortran


In a Fortran program I would like to choose at random a specific variable (specifically its index) by using weights. The weights would be provided in a separate vector (element 1 would contain weight of variable 1 and so on).

I have the following code who does the job without weight (mind being an integer vector with the index of each variable in the original dataset)

call rrand(xrand)
j = int(nn * xrand) + 1
mvar = mind(j)

Solution

  • Here are two examples. The first one is

    integer, parameter :: nn = 5
    real :: weight( nn ), cumsum( nn ), x
    
    weight( 1:nn ) = [ 1.0, 2.0, 5.0, 0.0, 2.0 ]
    
    do j = 1, nn
        cumsum( j ) = sum( weight( 1:j ) ) / sum( weight( 1:nn ) )   !! cumulative sum
    enddo
    
    x = rand()
    do j = 1, nn
        if ( x < cumsum( j ) ) exit
    enddo
    

    and the second one is taken from this page

    real :: sum_weight
    sum_weight = sum( weight( 1:nn ) )
    
    x = rand() * sum_weight
    do j = 1, nn
        if ( x < weight( j ) ) exit
        x = x - weight( j )
    enddo
    

    which is essentially the same as the first one. Both sample a random j from 1,2,...,5 with weight(j). 100000 trials give a distribution like

    j     :    1           2           3           4       5
    count :    10047       19879       50061       0       20013
    

    EDIT: A minimal test code is attached below (tested with gfortran-8/9):

    program main
        implicit none
        integer j, num( 5 ), loop
        real    weights( 5 )
    
        weights(:) = [ 1.0, 2.0, 5.0, 0.0, 2.0 ]
        num(:) = 0
    
        do loop = 1, 100000
            call random_index( j, weights )
            num( j ) = num( j ) + 1
        enddo
    
        do j = 1, size( weights )
            print *, j, num( j )
        enddo
    
    contains
    
    subroutine random_index( idx, weights )
        integer :: idx
        real, intent(in) :: weights(:)
    
        real x, wsum, prob
    
        wsum = sum( weights )
    
        call random_number( x )
    
        prob = 0
        do idx = 1, size( weights )
            prob = prob + weights( idx ) / wsum   !! 0 < prob < 1
            if ( x <= prob ) exit
        enddo
    end subroutine
    
    end program