
zfit straight line fitting for 2 dim dataset


I would like to fit a 2-dimensional plot (x, y points) with a straight line (a*x+b) using zfit, as in the following figure. [Figure: 2_dim_plot example]

This is very easy to do with the probfit package, but probfit has been deprecated by scikit-hep. https://nbviewer.jupyter.org/github/scikit-hep/probfit/blob/master/tutorial/tutorial.ipynb

How can I fit such 2-dim plots with an arbitrary function? I've checked the zfit examples, but they all seem to assume a distribution (histogram), so zfit expects a dataset like a 1d array, and I couldn't figure out how to pass 2d data to zfit.


Solution

  • There is currently no out-of-the-box (one-line) way to do this in zfit, simply because a corresponding loss has not been added.

    However, SimpleLoss (zfit.loss.SimpleLoss) allows you to construct any loss you can think of (have a look at the example in its docstring as well). In your case, this would look something like this:

    import tensorflow as tf
    import zfit

    x = your_data  # x-values
    y = your_targets  # y-values
    obs = zfit.Space('x', (lower, upper))
    
    param1 = zfit.Parameter(...)
    param2 = zfit.Parameter(...)
    ...
    model = Func(...)  # a function is the way to go here
    data = zfit.Data.from_numpy(array=x, obs=obs)
    
    def mse():
        prediction = model.func(data)
        value = tf.reduce_mean((prediction - y) ** 2)  # or whatever you want to have
        return value
    
    loss = zfit.loss.SimpleLoss(mse, [param1, param2])
    # etc.
    

    On another note, it would be a good idea to add such a loss to zfit. If you're interested in contributing, I recommend getting in contact with the authors; they will gladly help and guide you.

    UPDATE

    The loss function itself presumably consists of three or four things: x, y, a model and maybe an uncertainty on y (y_error). The chi2 loss looks like this:

    def chi2():
        y_pred = model.func(x)
        return tf.reduce_sum(((y_pred - y) / y_error) ** 2)
        
    loss = zfit.loss.SimpleLoss(chi2, model.get_params())
    
    

    That's all, 4 lines of code. x is a zfit.Data object, model is in this case a Func.

    Does that work for you?
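To make concrete what the chi2 above measures, here is a minimal NumPy-only sketch. It deliberately does not use zfit: for a straight line the chi2 minimum has a closed form (weighted least squares), so the result can serve as a cross-check for whatever a minimizer returns on the SimpleLoss. The synthetic data, the seed and the names true_a, true_b, a_hat and b_hat are assumptions made for illustration and are not part of the original question or answer.

```python
import numpy as np

# Synthetic data: purely illustrative, not taken from the question.
rng = np.random.default_rng(seed=0)
true_a, true_b = 2.0, -1.0
x = np.linspace(0.0, 10.0, 50)
y_error = np.full_like(x, 0.5)  # assumed per-point uncertainty on y
y = true_a * x + true_b + rng.normal(0.0, y_error)


def chi2(a, b):
    """Same quantity as the chi2 loss above, for the model a * x + b."""
    y_pred = a * x + b
    return np.sum(((y_pred - y) / y_error) ** 2)


# Weighted least squares: scale each row of the design matrix and each
# target by 1 / y_error, then solve the ordinary least-squares problem.
design = np.column_stack([x, np.ones_like(x)])
weights = 1.0 / y_error
(a_hat, b_hat), *_ = np.linalg.lstsq(
    design * weights[:, None], y * weights, rcond=None
)

print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, chi2 = {chi2(a_hat, b_hat):.1f}")
```

If a zfit fit of the same data with the chi2 loss converges, its a and b should agree with a_hat and b_hat within the fit uncertainties.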