Search code examples
rfunctionstatisticsproduction

Summation inside summation inside production in R


I have problems with the coding of a function to optimize in which there are two summations and one production, all with different indexing. I split the code into two functions for simplicity.

In the first function j goes from 0 to k:

w = function(n,k,gam){
  j = 0:k
  w = (1 / factorial(k)) * n * sum(choose(k, j * gam))
  return(w)}

In the second function k goes from 0 to n (that is fixed to 10); instead the production goes from 1 to length(x):

    f = function(gam,del){
   x = mydata #vector of 500 elements
   n = 10    
 k = 0:10  
 for (i in 0:10)
        pdf = prod( sum( w(n, k[i], gam) * (1 / del + (n/x)^(n+1))
return(-pdf)}

When I try the function I obtain the following error:

Error in 0:k : argument of length 0

Edit: This is what I am tryig to code

enter image description here

where I want to maximize L(d,g) using optim and:

enter image description here

and n is fixed to a specific value.


Solution

  • Solution

    Change for (i in 0:10) to for ( i in 1:11 ). Note: When I copied and ran your code I also noticed some unrelated bracket/parentheses omissions you may need to fix also.

    Explanation

    Your problem is that R uses a 1-based indexing system rather than a 0-based one like many other programming languages or some mathematical formulae. If you run the following code you'll get the same error, and it pinpoints the problem:

    k = 0:10
    for ( i in 0:10 ) {
        print(0:k[i])
    }
    
    Error in 0:k[i] : argument of length 0
    

    You get an error on the first iteration because there is no 0 element of k. Compare that to the following loop:

    k = 0:10
    for ( i in 1:11 ) {
        print(0:k[i])
    }
    
    [1] 0
    [1] 0 1
    [1] 0 1 2
    [1] 0 1 2 3
    [1] 0 1 2 3 4
    [1] 0 1 2 3 4 5
    [1] 0 1 2 3 4 5 6
    [1] 0 1 2 3 4 5 6 7
    [1] 0 1 2 3 4 5 6 7 8
     [1] 0 1 2 3 4 5 6 7 8 9
     [1]  0  1  2  3  4  5  6  7  8  9 10
    

    Update

    Your comment to the answer clarifies some additional information you need:

    Just to full understand everything, how do I know in a situation like this that R is indexing the production on x and the summation on k?

    The short answer is that it depends on how you nest your loops and function calls. In more detail:

    • When you call f(), you start a for loop over the elements of k, so R is indexing the block of code within the for loop (everything in the braces in my re-formatted version of f() below) "on" k. For every element in k, you assign prod(...) to pdf (Side note: I don't know why you're re-writing over pdf in every iteration of this loop)
    • sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1)) produces a vector of length max(length(w(n, k[i], gam)), length(s)) (side note: Beware of recycling! -- see Section 2.2 of "An Introduction to R"); prod(sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1))) effectively indexes over the elements of that vector
    • w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1) produces a vector of length max(length(w(n, k[i], gam)), length(s)); sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1)) effectively indexes over the elements of that vector
    • Etc.

    What you're indexing over, explicitly or implicitly through vectorized operations, depends on which level of nested loops or function calls you're talking about. You may need some careful thinking and planning about when you want to index over what, which will tell you how you need to nest things. Put the operation whose indices should vary fastest on the innermost call. For example, in effect, prod(1:3 + sum(1:3)) will index over sum(1:3) to produce that sum first then index over 1:3 + sum(1:3) to produce the product. I.e., sum(1:3) = 1 + 2 + 3 = 6, then prod(1:3 + sum(1:3)) = (1 + 6) * (2 + 6) * (3 + 6) = 7 * 8 * 9 = 504. It's just like how parentheses work in mathematics.

    Also, another side note, I wouldn't refer to global variables from within a function as you do in f() -- I've highlighted below in your code where you do that and offered an alternative that doesn't do it.

    f = function(gam, del){
        x = mydata # don't refer to a global variable "mydata", make it an argument
        n = 10  
        s = n / x  
        k = 1:11
        for (i in 1:11){
            pdf = prod( sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1)))
        }
        return(-pdf)
    }
    
    # Do this instead
    # (though there are still other things to fix,
    # like re-writing over "pdf" eleven times and only using the last value)
    
    f = function(gam, del, x, n = 10) {
        s = n / x
        s = n / x  
        k = 0:10
        for (i in 1:11){
            pdf = prod( sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1)))
        }
        return(-pdf)
    }