I have data of the form
n = number of samples
features: n x 1 matrix
data: n x m matrix
I want to perform multiple Poisson regressions with the same features, where the output values vary across the columns of data. Currently, I do one Poisson regression at a time using sklearn, e.g. my Python code looks something like:
from sklearn import linear_model

clf = linear_model.PoissonRegressor(fit_intercept=True, alpha=0)
for col in range(m):
    clf.fit(features, data[:, col])
However, I have to run many of these Poisson regressions, and fitting them one at a time is far too slow. So my question is: is there a way (in Python) to fit all m of these Poisson regressions simultaneously?
If I were doing linear regression instead, then I could use nice matrix tricks to do these simultaneously. However the key difference here is that the Poisson regression involves using an optimization algorithm to maximize a likelihood function. So essentially, I would like to solve multiple optimization problems at once.
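Because the Poisson log-likelihood separates across the columns of data, one possible way to solve the m optimization problems at once is to run the same Newton (IRLS) iteration on every column in a single batched NumPy computation. The following is only a minimal sketch under stated assumptions, not a drop-in replacement for sklearn: the function name fit_poisson_batched, the log-mean initialization, the clipping bounds, and the stopping rule are all illustrative choices, and it takes full Newton steps with no line search, so it may need damping on difficult data.

```python
import numpy as np


def fit_poisson_batched(features, data, n_iter=50, tol=1e-8):
    """Fit m unpenalized Poisson regressions (log link) that share one design matrix.

    features: (n, k) array, data: (n, m) array of counts.
    Returns a (k + 1, m) array; row 0 holds the intercepts.
    """
    n = features.shape[0]
    X = np.column_stack([np.ones(n), features])      # add intercept column
    p, m = X.shape[1], data.shape[1]

    # Assumed initialization: intercept = log(mean count), slopes = 0.
    B = np.zeros((p, m))
    B[0] = np.log(np.maximum(data.mean(axis=0), 1e-8))

    for _ in range(n_iter):
        # Clip the linear predictor to avoid overflow in exp (assumed bounds).
        mu = np.exp(np.clip(X @ B, -30.0, 30.0))     # (n, m) fitted means
        G = X.T @ (data - mu)                        # (p, m) score vectors
        # One small (p, p) Fisher information matrix per column of data.
        H = np.einsum("ni,nj,nm->mij", X, X, mu)     # (m, p, p)
        step = np.linalg.solve(H, G.T[:, :, None])[:, :, 0]   # (m, p)
        B += step.T
        if np.max(np.abs(step)) < tol:
            break
    return B
```

The returned array has one column of coefficients per column of data, so `B[:, j]` corresponds to the regression on `data[:, j]`.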
One thing I tried was to use scipy.optimize to maximize the sum of the (log) likelihoods for each Poisson regression. However, this was incredibly sensitive to the initialization and did not converge.
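For reference, the summed log-likelihood approach can be sketched as below. This is not a reconstruction of the attempt described above, and it is not guaranteed to fix the convergence problem; it only illustrates two choices that often matter for this kind of objective: supplying the analytic gradient through `jac=True`, and starting each intercept at the log of the column mean. The function name fit_poisson_joint, the scaling by n, and the choice of L-BFGS-B are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def fit_poisson_joint(features, data):
    """Minimize the summed Poisson negative log-likelihood over all m columns."""
    n, m = data.shape
    X = np.column_stack([np.ones(n), features])   # add intercept column
    p = X.shape[1]

    def neg_loglik(flat):
        B = flat.reshape(p, m)
        eta = X @ B                               # (n, m) linear predictors
        mu = np.exp(eta)
        # The log(y!) term is constant in B and is dropped.
        value = np.sum(mu - data * eta) / n
        grad = X.T @ (mu - data) / n              # (p, m) analytic gradient
        return value, grad.ravel()

    # Assumed initialization: intercept = log(mean count), slopes = 0.
    B0 = np.zeros((p, m))
    B0[0] = np.log(np.maximum(data.mean(axis=0), 1e-8))

    result = minimize(neg_loglik, B0.ravel(), jac=True, method="L-BFGS-B")
    return result.x.reshape(p, m), result
```

The second return value is the `OptimizeResult`, so `result.success` and `result.message` can be inspected when a fit looks suspicious.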
Thus I am hoping there is either:
Does anyone have any ideas? Any help would be greatly appreciated. Thank you!
It looks like you are looking for multiprocessing.
It will enable you to run N Poisson regression calculations at the same time.
A simple example below:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
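Applied to the question, the same pattern might look like the sketch below, where each worker fits one column of data with sklearn's PoissonRegressor. The helper names fit_one and fit_all, the synthetic data, and the process count are assumptions for illustration. Any speedup depends on the number of cores and on the cost of sending the arrays to the worker processes.

```python
from functools import partial
from multiprocessing import Pool

import numpy as np
from sklearn.linear_model import PoissonRegressor


def fit_one(features, y):
    """Fit a single unpenalized Poisson regression; return (intercept, coefficients)."""
    model = PoissonRegressor(fit_intercept=True, alpha=0)
    model.fit(features, y)
    return model.intercept_, model.coef_


def fit_all(features, data, processes=4):
    """Fit one regression per column of data, spread across worker processes."""
    columns = [data[:, j] for j in range(data.shape[1])]
    with Pool(processes) as pool:
        # partial() binds the shared features so each task only varies in y.
        return pool.map(partial(fit_one, features), columns)


if __name__ == "__main__":
    # Small synthetic data set, for illustration only.
    rng = np.random.default_rng(0)
    features = rng.uniform(-1, 1, size=(300, 1))
    data = rng.poisson(np.exp(1.0 + 0.5 * features), size=(300, 8))
    results = fit_all(features, data)
    print(len(results))
```

The results come back in column order, so `results[j]` holds the intercept and coefficients for `data[:, j]`.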