Tags: python, scipy, sympy, bayesian, pymc3

How do I implement Type II maximum likelihood estimation?


I am trying to implement an empirical Bayes ML-II (Type II maximum likelihood) method for estimating the parameters of a prior distribution from historical data, where:

  1. π(θ) is an expression for the prior distribution
  2. p(x|θ) is an expression for the data distribution
  3. m(x) is an expression for the marginal distribution

According to the ML-II procedure, I first need to integrate out θ to obtain an expression for the marginal distribution, and then maximize that expression to estimate the parameters of the prior. The maximization can be done with methods such as scipy.optimize. So the question is: how do I perform the integration?

m(x) = ∫ p(x|θ) π(θ) dθ
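For a concrete worked case, here is a sketch of both steps using sympy for the integral and scipy.optimize for the maximization. The model is my own illustration, not from the question: a Normal(μ, τ²) prior on θ and a Normal(θ, σ²) data distribution with σ known, for which the marginal works out to Normal(μ, σ² + τ²).

```python
import numpy as np
import sympy as sp
from scipy import optimize, stats

# --- Symbolic step: integrate theta out of p(x|theta) * pi(theta) ---
# Illustrative model (an assumption, not from the question):
#   prior       pi(theta)  = Normal(mu, tau^2)
#   data dist.  p(x|theta) = Normal(theta, sigma^2), sigma known
x, theta, mu = sp.symbols('x theta mu', real=True)
sigma, tau = sp.symbols('sigma tau', positive=True)

prior = sp.exp(-(theta - mu)**2 / (2 * tau**2)) / sp.sqrt(2 * sp.pi * tau**2)
lik = sp.exp(-(x - theta)**2 / (2 * sigma**2)) / sp.sqrt(2 * sp.pi * sigma**2)
# The marginal m(x); for this conjugate pair it is Normal(mu, sigma^2 + tau^2)
marginal = sp.simplify(sp.integrate(prior * lik, (theta, -sp.oo, sp.oo)))

# --- Numeric step: maximize the marginal likelihood over (mu, tau) ---
sigma_known = 1.0
rng = np.random.default_rng(0)
true_mu, true_tau = 2.0, 0.5
thetas = rng.normal(true_mu, true_tau, size=500)  # simulated "historical" effects
data = rng.normal(thetas, sigma_known)            # observed historical data

def neg_log_marginal(params):
    m_, t_ = params
    scale = np.sqrt(sigma_known**2 + t_**2)
    return -np.sum(stats.norm.logpdf(data, loc=m_, scale=scale))

res = optimize.minimize(neg_log_marginal, x0=[0.0, 1.0],
                        bounds=[(None, None), (1e-6, None)])
print(res.x)  # ML-II estimates of (mu, tau)
```

The same pattern applies to non-conjugate models, except that sympy may not find a closed form, in which case the inner integral has to be done numerically (e.g. scipy.integrate.quad) inside the objective.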


Solution

  • Here's a go at this using symfit. As an example, I chose to sample from a bivariate normal distribution with no covariance.

    import numpy as np
    import matplotlib.pyplot as plt
    from symfit import Model, Fit, Parameter, Variable, integrate, oo
    from symfit.distributions import Gaussian
    from symfit.core.objectives import LogLikelihood
    
    # Make variables and parameters
    x = Variable('x')
    y = Variable('y')
    m = Variable('m')
    x0 = Parameter('x0', value=0.6, min=0.5, max=0.7)
    sig_x = Parameter('sig_x', value=0.1)
    y0 = Parameter('y0', value=0.7, min=0.6, max=0.9)
    sig_y = Parameter('sig_y', value=0.05)
    
    # Build the joint pdf, then integrate out y to get the marginal in x
    pdf = Gaussian(x=x, mu=x0, sig=sig_x) * Gaussian(x=y, mu=y0, sig=sig_y)
    marginal = integrate(pdf, (y, -oo, oo), conds='none')
    print(pdf)
    print(marginal)
    
    model = Model({m: marginal})
    
    # Draw 10000 samples from a bivariate normal distribution
    mean = [0.59, 0.8]
    cov = [[0.11**2, 0], [0, 0.23**2]]
    xdata, ydata = np.random.multivariate_normal(mean, cov, 10000).T
    
    # We provide only xdata to the model
    fit = Fit(model, xdata, objective=LogLikelihood)
    fit_result = fit.execute()
    print(fit_result)
    
    xaxis = np.linspace(0, 1.0)
    plt.hist(xdata, bins=100, density=True)
    plt.plot(xaxis, model(x=xaxis, **fit_result.params).m)
    plt.show()
    

    This prints the following for the pdf and the marginal distribution:

    >>> exp(-(-x0 + x)**2/(2*sig_x**2))*exp(-(-y0 + y)**2/(2*sig_y**2))/(2*pi*Abs(sig_x)*Abs(sig_y))
    >>> sqrt(2)*sig_y*exp(-(-x0 + x)**2/(2*sig_x**2))/(2*sqrt(pi)*Abs(sig_x)*Abs(sig_y))
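    Note that sig_y cancels out of that marginal entirely: what remains is just a normalized 1D Gaussian in x, which is why the fit below can say nothing about sig_y. A quick sympy check of the printed expression (independent of symfit; Abs() is dropped since the widths are positive):

```python
import sympy as sp

x, x0 = sp.symbols('x x0', real=True)
sig_x, sig_y = sp.symbols('sig_x sig_y', positive=True)

# The marginal exactly as symfit printed it (Abs == identity for positive widths)
marginal = (sp.sqrt(2) * sig_y * sp.exp(-(x - x0)**2 / (2 * sig_x**2))
            / (2 * sp.sqrt(sp.pi) * sig_x * sig_y))
# An ordinary normalized Gaussian in x
gauss_x = sp.exp(-(x - x0)**2 / (2 * sig_x**2)) / sp.sqrt(2 * sp.pi * sig_x**2)

diff = sp.simplify(marginal - gauss_x)
print(diff)  # 0 -> sig_y has dropped out completely
```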
    

    And for the fit results:

    Parameter Value        Standard Deviation
    sig_x     1.089585e-01 7.704533e-04
    sig_y     5.000000e-02 nan
    x0        5.905688e-01 -0.000000e+00
    Fitting status message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
    Number of iterations:   9
    Regression Coefficient: nan
    

    (Plot: histogram of xdata with the fitted marginal overlaid.)

    You can see that x0 and sig_x have been obtained properly, but no information can be obtained about the parameters associated with y. I think that makes sense in this example since there is no correlation, but I'll leave you to struggle with those kinds of details ;).
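    As a sanity check that doesn't depend on symfit, the same x0/sig_x estimates can be recovered by minimizing the negative log-likelihood of the x marginal directly with scipy.optimize (fresh random data here, so the numbers will differ slightly from the table above):

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(42)
xdata = rng.normal(0.59, 0.11, size=10000)  # same x marginal as in the example

def nll(params):
    x0_, sig_x_ = params
    return -np.sum(stats.norm.logpdf(xdata, loc=x0_, scale=sig_x_))

res = optimize.minimize(nll, x0=[0.5, 0.2], bounds=[(None, None), (1e-6, None)])
print(res.x)  # should land close to (0.59, 0.11)
```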