Numerically stable calculation of invariant mass in particle physics?

In particle physics, we have to compute the invariant mass a lot, which for a two-body decay is

$$M^2 = m_1^2 + m_2^2 + 2 \left( \sqrt{m_1^2 + |\vec p_1|^2} \, \sqrt{m_2^2 + |\vec p_2|^2} - \vec p_1 \cdot \vec p_2 \right)$$

The momenta (p1, p2) are sometimes very large (by a factor of 1000 or more) compared to the masses (m1, m2). In that case, a large cancellation happens between the last two terms when the calculation is carried out with floating point numbers on a computer.
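To make the cancellation concrete, here is a minimal sketch of the collinear (1D) case. The helper `msq_naive_1d` and the chosen input values are only illustrative assumptions, not part of the original problem statement. Two particles with equal momentum and equal mass move together, so the exact result is (m1 + m2)^2 = 4. In single precision, p^2 = 1e10 has a spacing between neighbouring floats of 1024, so adding m^2 = 1 is rounded away before the subtraction even happens.

```python
import numpy as np

def msq_naive_1d(p1, p2, m1, m2, dtype):
    """Naive invariant mass squared for collinear momenta, computed in `dtype`."""
    p1, p2, m1, m2 = (dtype(v) for v in (p1, p2, m1, m2))
    e1 = np.sqrt(p1 * p1 + m1 * m1)  # energy of particle 1
    e2 = np.sqrt(p2 * p2 + m2 * m2)  # energy of particle 2
    # e1 * e2 and p1 * p2 agree in almost all digits when p >> m
    return m1 * m1 + m2 * m2 + dtype(2) * (e1 * e2 - p1 * p2)

exact = 4.0  # (m1 + m2)^2 for two particles moving together
res32 = float(msq_naive_1d(1e5, 1e5, 1.0, 1.0, np.float32))
res64 = float(msq_naive_1d(1e5, 1e5, 1.0, 1.0, np.float64))

print("float32:", res32, "relative error:", abs(res32 - exact) / exact)
print("float64:", res64, "relative error:", abs(res64 - exact) / exact)
```

Double precision still looks fine for this ratio of momentum to mass, but it only postpones the problem to larger ratios.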

What kind of numerical tricks can be used to compute this accurately for any inputs?

The question is about suitable numerical tricks to improve the accuracy of the calculation with floating point numbers, so the solution should be language-agnostic. For demonstration purposes, implementations in Python are preferred. Solutions which reformulate the problem and increase the number of elementary operations are acceptable, but solutions which suggest using other number types like decimal or multi-precision floating point numbers are not.

Note: The original question presented a simplified 1D problem in the form of a Python expression, but the question is about the general case where the momenta are given in 3D. The question was reformulated accordingly.

Solution

With a few tricks listed on Stack Overflow and the transformation described by Jakob Stark in his answer, it is possible to rewrite the equation into a form that no longer suffers from catastrophic cancellation.

The original question asked for a solution in 1D, which has a simple solution, but in practice, we need the formula in 3D and then the solution is more complicated. See this notebook for a full derivation.
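For readers who do not want to open the notebook, the transformation can be sketched as follows. This is my reading of the implementation given in this answer, not a copy of the notebook's derivation. The symbols x_i, x and α are notation chosen here to match the variables `x1`, `x2`, `x` and `a` in the code.

```latex
\begin{align}
x_i &= \frac{m_i^2}{|\vec p_i|^2}, \qquad
x = x_1 + x_2 + x_1 x_2 = (1 + x_1)(1 + x_2) - 1 \\
E_1 E_2 &= |\vec p_1|\,|\vec p_2| \sqrt{(1 + x_1)(1 + x_2)}
         = |\vec p_1|\,|\vec p_2| \sqrt{1 + x} \\
\vec p_1 \cdot \vec p_2 &= |\vec p_1|\,|\vec p_2| \cos\alpha \\
M^2 &= m_1^2 + m_2^2
     + 2\,|\vec p_1|\,|\vec p_2| \left( \sqrt{1 + x} - \cos\alpha \right) \\
\sqrt{1 + x} - \cos\alpha
    &= \frac{(1 + x) - \cos^2\alpha}{\sqrt{1 + x} + \cos\alpha}
     = \frac{x + \sin^2\alpha}{\sqrt{1 + x} + \cos\alpha}
     \qquad (\cos\alpha \ge 0)
\end{align}
```

For cos α ≥ 0 the last line replaces the difference of two nearly equal numbers by a quotient of sums of non-negative terms. For cos α < 0 both terms of √(1 + x) − cos α are positive, so nothing cancels and the difference can be evaluated directly.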

Example implementation of numerically stable calculation in 3D in Python:

import numpy as np

# numerically stable implementation
@np.vectorize
def msq2(px1, py1, pz1, px2, py2, pz2, m1, m2):
    p1_sq = px1 ** 2 + py1 ** 2 + pz1 ** 2
    p2_sq = px2 ** 2 + py2 ** 2 + pz2 ** 2
    m1_sq = m1 ** 2
    m2_sq = m2 ** 2
    x1 = m1_sq / p1_sq
    x2 = m2_sq / p2_sq
    x = x1 + x2 + x1 * x2
    a = angle(px1, py1, pz1, px2, py2, pz2)
    cos_a = np.cos(a)
    if cos_a >= 0:
        y1 = (x + np.sin(a) ** 2) / (np.sqrt(x + 1) + cos_a) 
    else:
        y1 = -cos_a + np.sqrt(x + 1) 
    y2 = 2 * np.sqrt(p1_sq * p2_sq)
    return m1_sq + m2_sq + y1 * y2

# numerically stable calculation of angle
def angle(x1, y1, z1, x2, y2, z2):
    # cross product (only its norm is used below)
    cx = y1 * z2 - y2 * z1
    cy = z1 * x2 - z2 * x1
    cz = x1 * y2 - x2 * y1
    
    # norm of cross product
    c = np.sqrt(cx * cx + cy * cy + cz * cz)
    
    # dot product
    d = x1 * x2 + y1 * y2 + z1 * z2
    
    return np.arctan2(c, d)
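Why compute the angle with arctan2 of the cross and dot products instead of the textbook arccos of the normalised dot product? The following sketch compares both for two nearly parallel vectors in single precision. The function `angle_arccos` is a hypothetical baseline added here for illustration, and the test vectors are arbitrary.

```python
import numpy as np

f32 = np.float32

def angle_arccos(a, b):
    """Textbook angle from the normalised dot product (baseline for comparison)."""
    cos_t = np.dot(a, b) / (np.sqrt(np.dot(a, a)) * np.sqrt(np.dot(b, b)))
    # cos_t is extremely close to 1 for small angles, so arccos amplifies rounding
    return np.arccos(np.clip(cos_t, f32(-1), f32(1)))

def angle_arctan2(a, b):
    """Angle from the norm of the cross product and the dot product."""
    c = np.cross(a, b)
    return np.arctan2(np.sqrt(np.dot(c, c)), np.dot(a, b))

# two nearly parallel vectors, true angle = arctan(1e-4), roughly 1e-4 rad
a = np.array([1.0, 0.0, 0.0], dtype=f32)
b = np.array([1.0, 1e-4, 0.0], dtype=f32)

print("arccos :", float(angle_arccos(a, b)))
print("arctan2:", float(angle_arctan2(a, b)))
```

In this example the arccos variant cannot resolve the angle, because 1 + 1e-8 is not distinguishable from 1 in single precision, while the arctan2 variant keeps the small angle in the cross product term.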

The numerically stable implementation can never produce a negative result, which is a commonly occurring problem with naive implementations, even in double precision.

Let's compare the numerically stable function with a naive implementation.

# naive implementation
def msq1(px1, py1, pz1, px2, py2, pz2, m1, m2):
    p1_sq = px1 ** 2 + py1 ** 2 + pz1 ** 2
    p2_sq = px2 ** 2 + py2 ** 2 + pz2 ** 2
    m1_sq = m1 ** 2
    m2_sq = m2 ** 2
    
    # energies of particles 1 and 2
    e1 = np.sqrt(p1_sq + m1_sq)
    e2 = np.sqrt(p2_sq + m2_sq)

    # dangerous cancellation in third term
    return m1_sq + m2_sq + 2 * (e1 * e2 - (px1 * px2 + py1 * py2 + pz1 * pz2))

For the following image, the momenta p1 and p2 are randomly picked from 1 to 1e5, and the masses m1 and m2 from 1e-5 to 1e5. All implementations get the input values in single precision. The reference in both cases is calculated with mpmath using the naive formula with 100 decimal places.

The naive implementation loses all accuracy for some inputs, while the numerically stable implementation does not.
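A comparison of this kind can be sketched in a self-contained way as follows. This is not the script behind the image: it uses two hand-picked inputs instead of random ones, swaps in Python's standard `decimal` module for mpmath as the 100-digit reference, and re-implements both formulas as `msq_naive32` and `msq_stable32` (names made up here) so that every intermediate result stays in single precision.

```python
from decimal import Decimal, getcontext
import numpy as np

getcontext().prec = 100  # reference precision: 100 significant digits
f32 = np.float32

def _as_f32(p1, p2, m1, m2):
    """Round all inputs to single precision once, for every implementation."""
    return np.asarray(p1, dtype=f32), np.asarray(p2, dtype=f32), f32(m1), f32(m2)

def msq_naive32(p1, p2, m1, m2):
    p1, p2, m1, m2 = _as_f32(p1, p2, m1, m2)
    e1 = np.sqrt(np.dot(p1, p1) + m1 * m1)
    e2 = np.sqrt(np.dot(p2, p2) + m2 * m2)
    return m1 * m1 + m2 * m2 + f32(2) * (e1 * e2 - np.dot(p1, p2))

def msq_stable32(p1, p2, m1, m2):
    p1, p2, m1, m2 = _as_f32(p1, p2, m1, m2)
    p1_sq, p2_sq = np.dot(p1, p1), np.dot(p2, p2)
    x1, x2 = m1 * m1 / p1_sq, m2 * m2 / p2_sq
    x = x1 + x2 + x1 * x2
    c = np.cross(p1, p2)
    a = np.arctan2(np.sqrt(np.dot(c, c)), np.dot(p1, p2))
    cos_a = np.cos(a)
    if cos_a >= 0:
        y1 = (x + np.sin(a) ** 2) / (np.sqrt(x + f32(1)) + cos_a)
    else:
        y1 = np.sqrt(x + f32(1)) - cos_a
    return m1 * m1 + m2 * m2 + y1 * f32(2) * np.sqrt(p1_sq * p2_sq)

def msq_reference(p1, p2, m1, m2):
    """Naive formula in 100-digit decimal arithmetic on the same rounded inputs."""
    p1, p2, m1, m2 = _as_f32(p1, p2, m1, m2)
    d1 = [Decimal(float(v)) for v in p1]
    d2 = [Decimal(float(v)) for v in p2]
    dm1, dm2 = Decimal(float(m1)), Decimal(float(m2))
    e1 = (sum(v * v for v in d1) + dm1 * dm1).sqrt()
    e2 = (sum(v * v for v in d2) + dm2 * dm2).sqrt()
    dot = sum(u * v for u, v in zip(d1, d2))
    return dm1 * dm1 + dm2 * dm2 + 2 * (e1 * e2 - dot)

cases = {
    "parallel, p >> m": ([1e5, 0, 0], [1e5, 0, 0], 1.0, 1.0),
    "back-to-back": ([1e3, 0, 0], [-1e3, 0, 0], 1.0, 1.0),
}
for label, args in cases.items():
    ref = float(msq_reference(*args))
    for name, func in (("naive", msq_naive32), ("stable", msq_stable32)):
        rel = abs(float(func(*args)) - ref) / ref
        print(f"{label:18s} {name:6s} relative error = {rel:.1e}")
```

The parallel case is where the naive formula breaks down, because the energy product and the dot product agree in all digits that single precision can hold. The back-to-back case has no cancellation and should be harmless for both implementations.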