Tags: python, pandas, iminuit

Different output with equal input for python program when using iminuit: undefined behavior?


I encountered this weird bug where iminuit cannot converge on a naive linear model. However, the real problem is that if I uncomment the line "#bins = np.linspace(0,4,25)", the result of the program is different, and it converges.

If "same input" does not produce "same output", it means there is undefined behavior, or segmentation fault. Any ideas?

import numpy as np
import scipy as sp
import scipy.special
import probfit
import pandas as pd

data = pd.read_feather('test.feather').rho2.to_numpy()
print(data)

# Histogram the data into 24 bins over [0, 4]
N,bins = np.histogram(data,bins=24,range=(0,4))
#bins = np.linspace(0,4,25)   # uncommenting this line changes the result
print(bins)
x = (bins[:-1]+bins[1:])/2   # bin centres

exposure = 3.8061025098100147

# Twice the Poisson negative log-likelihood for the linear model y0 + k*x
def cost(y0,k):
    global x,exposure,N
    T = (y0+k*x)*exposure
    return -2*np.sum(N*np.log(T)-T-sp.special.loggamma(N+1))

import iminuit
minimizer = iminuit.Minuit(cost,errordef=1,y0=11,k=3,limit_y0=(0,None),limit_k=(0.1,None))
minimizer.migrad()
minimizer.hesse()
minimizer.minos()
display(minimizer.fmin, minimizer.params,minimizer.merrors)   # Jupyter display
minimizer.draw_mncontour("y0","k")

Output: (screenshot of the fit output)

Test input: test.feather.zip


Solution

  • The bug is solved. Although the two arrays look the same when printed, their dtypes are different:

    data = pd.read_feather('test.feather').rho2.to_numpy()
    N,bins = np.histogram(data,bins=24,range=(0,4))
    

    vs

    bins = np.linspace(0,4,25)
    

    The first returns float32 bin edges, while the second returns float64. iminuit computes gradients with numerical methods, so the output of the cost function needs to be at least float64 precision (see the sketch below).
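    A minimal sketch of the dtype difference (assuming NumPy's usual type-promotion rules; the random toy data here is only illustrative, not the original test.feather):

    import numpy as np

    # np.histogram derives its bin-edge dtype from the input array,
    # so float32 data gives float32 edges; np.linspace defaults to float64.
    data32 = np.random.default_rng(0).uniform(0, 4, 1000).astype('float32')
    _, edges_hist = np.histogram(data32, bins=24, range=(0, 4))
    edges_lin = np.linspace(0, 4, 25)
    print(edges_hist.dtype)   # float32
    print(edges_lin.dtype)    # float64

    With float32 bin centres, intermediate terms such as np.log(T) are evaluated in single precision, so the cost carries only about 7 significant digits and MIGRAD's finite-difference steps can fall into rounding noise.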

    The best solution is:

    data = pd.read_feather('test.feather').rho2.to_numpy().astype('float64')
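    An alternative (a sketch, not from the original answer) is to cast only the arrays the cost function consumes, leaving the loading code untouched:

    # Cast the histogram output instead of the raw data.
    N, bins = np.histogram(data, bins=24, range=(0, 4))
    bins = bins.astype('float64')
    N = N.astype('float64')
    x = (bins[:-1] + bins[1:]) / 2   # bin centres, now float64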