Search code examples
wolfram-mathematicahessian-matrix

How to compile a function that computes the Hessian?


I am looking to see how a function that computes the Hessian of a log-likelihood can be compiled, so that it can be efficiently used with different sets of parameters.

Here is an example.

Suppose we have a function that computes the log-likelihood of a logit model, where y is vector and x is a matrix. beta is a vector of parameters.

 pLike[y_, x_, beta_] :=
 Module[
  {xbeta, logDen},
  xbeta = x.beta;
  logDen = Log[1.0 + Exp[xbeta]];
  Total[y*xbeta - logDen]
  ]

Given the following data, we can use it as follows

In[1]:= beta = {0.5, -1.0, 1.0};

In[2]:= xmat = 
  Table[Flatten[{1, 
     RandomVariate[NormalDistribution[0.0, 1.0], {2}]}], {500}];

In[3]:= xbeta = xmat.beta;

In[4]:= prob = Exp[xbeta]/(1.0 + Exp[xbeta]);

In[5]:= y = Map[RandomVariate[BernoulliDistribution[#]] &, prob] ;

In[6]:= Tally[y]

Out[6]= {{1, 313}, {0, 187}}

In[9]:= pLike[y, xmat, beta]

Out[9]= -272.721

We can write its hessian as follows

 hessian[y_, x_, z_] :=
  Module[{},
   D[pLike[y, x, z], {z, 2}]
  ]


In[10]:= z = {z1, z2, z3}

Out[10]= {z1, z2, z3}

In[11]:= AbsoluteTiming[hess = hessian[y, xmat, z];]

Out[11]= {0.1248040, Null}

In[12]:= AbsoluteTiming[
 Table[hess /. {z1 -> 0.0, z2 -> -0.5, z3 -> 0.8}, {100}];]

Out[12]= {14.3524600, Null}

For efficiency reasons, I can compile the original likelihood function as follows

pLikeC = Compile[{{y, _Real, 1}, {x, _Real, 2}, {beta, _Real, 1}},
   Module[
    {xbeta, logDen},
    xbeta = x.beta;
    logDen = Log[1.0 + Exp[xbeta]];
    Total[y*xbeta - logDen]
    ],
   CompilationTarget -> "C", Parallelization -> True,  
   RuntimeAttributes -> {Listable}
   ];

which yields the same answer as pLike

In[10]:= pLikeC[y, xmat, beta]

Out[10]= -272.721

I am looking for an easy way to obtain similarly, a compiled version of the hessian function, given my interest in evaluating it many times.


Solution

  • Here is an idea based on the previous post(s): We construct the input to Compile symbolically.

    mkCHessian[{y_, ys_Integer}, {x_, xs_Integer}, {beta_, bs_Integer}] :=
      With[{
       args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}],
       yi = Quiet[Part[y, #] & /@ Range[ys]],
       xi = Quiet[Table[Part[x, i, j], {i, xs}, {j, xs}]],
       betai = Quiet[Part[beta, #] & /@ Range[bs]]
       },
      Print[args];
      Print[yi];
      Print[xi];
      Print[betai];
      Compile[Evaluate[args], 
       Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
      ]
    

    And then generate the compiled function.

    cf = mkCHessian[{y, 3}, {x, 3}, {beta, 3}];
    

    You then call that compiled function

    cf[y, xmat, beta]
    

    Please verify that I did not make a mistake; in de Vries's post y is of length 2. Mine is length 3. I am sure what is correct. And of course, the Print are for illustration...


    Update
    A version with slightly improved dimension handling and with variables localized:

    ClearAll[mkCHessian];
    mkCHessian[ys_Integer, bs_Integer] :=
     Module[
       {beta, x, y, args, xi, yi, betai},
       args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}];
       yi = Quiet[Part[y, #] & /@ Range[ys]];
       xi = Quiet[Table[Part[x, i, j], {i, ys}, {j, bs}]];
       betai = Quiet[Part[beta, #] & /@ Range[bs]];
       Compile[Evaluate[args], Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
     ]
    

    Now, with asim's definitions in In[1] to In[5]:

    cf = mkCHessian[500, 3];
    cf[y, xmat, beta]
    
    (* ==> {{-8.852446923, -1.003365612, 1.66653381}, 
           {-1.003365612, -5.799363241, -1.277665283},
           {1.66653381, -1.277665283, -7.676551252}}  *)
    

    Since y is a random vector results will vary.