Search code examples
pythonnumpyint64

np.int64 behaves differently from int in math-operations


I have come across a very strange problem where i do a lot of math and the result is inf or nan when my input is of type <class 'numpy.int64'>, but i get the correct (checked analytically) results when my input is of type <class 'int'>. The only library functions i use are np.math.factorial(), np.sum() and np.array(). I also use a generator object to sum over series and the Boltzmann constant from scipy.constants.

My question is essentially this: Are their any known cases where np.int64 objects will behave very differently from int objects?

When i run with np.int64 input, i get the RuntimeWarnings: overflow encountered in long_scalars, divide by zero encountered in double_scalars and invalid value encountered in double_scalars. However, the largest number i plug into the factorial function is 36, and i don't get these warnings when i use int input.

Below is a code that reproduces the behaviour. I was unable to find out more exactly where it comes from.

import numpy as np
import scipy.constants as const

# Some representible numbers
sigma = np.array([1, 2])
sigma12 = 1.5
mole_weights = np.array([10,15])
T = 100
M1, M2 = mole_weights/np.sum(mole_weights)
m0 = np.sum(mole_weights)

fac = np.math.factorial

def summation(start, stop, func, args=None):
    #sum over the function func for all ints from start to and including stop, pass 'args' as additional arguments
    if args is not None:
        return sum(func(i, args) for i in range(start, stop + 1))
    else:
        return sum(func(i) for i in range(start, stop + 1))

def delta(i, j):
    #kronecker delta
    if i == j:
        return 1
    else:
        return 0

def w(l, r):
    # l,r are ints, return a float
    return 0.25 * (2 - ((1 / (l + 1)) * (1 + (-1) ** l))) * np.math.factorial(r + 1)

def omega(ij, l, r):
    # l, r are int, ij is and ID, returns float
    if ij in (1, 2):
        return sigma[ij - 1] ** 2 * np.sqrt(
            (np.pi * const.Boltzmann * T) / mole_weights[ij - 1]) * w(l, r)

    elif ij in (12, 21):
        return 0.5 * sigma12 ** 2 * np.sqrt(
            2 * np.pi * const.Boltzmann * T / (m0 * M1 * M2)) * w(l, r)
    else:
        raise ValueError('(' + str(ij) + ', ' + str(l) + ', ' + str(r) + ') are non-valid arguments for omega.')


def A_prime(p, q, r, l):
    '''
    p, q, r, l are ints. returns a float
    '''
    F = (M1 ** 2 + M2 ** 2) / (2 * M1 * M2)
    G = (M1 - M2) / M2

    def inner(w, args):
        i, k = args
        return ((8 ** i * fac(p + q - 2 * i - w) * (-1) ** (r + i) * fac(r + 1) * fac(
            2 * (p + q + 2 - i - w)) * 2 ** (2 * r) * F ** (i - k) * G ** w) /
                (fac(p - i - w) * fac(q - i - w) * fac(r - i) * fac(p + q + 1 - i - r - w) * fac(2 * r + 2) * fac(
                    p + q + 2 - i - w)
                 * 4 ** (p + q + 1) * fac(k) * fac(i - k) * fac(w))) * (
                       2 ** (2 * w - 1) * M1 ** i * M2 ** (p + q - i - w)) * 2 * (
                       M1 * (p + q + 1 - i - r - w) * delta(k, l) - M2 * (r - i) * delta(k, l - 1))

    def sum_w(k, i):
        return summation(0, min(p, q, p + q + 1 - r) - i, inner, args=(i, k))

    def sum_k(i):
        return summation(l - 1, min(l, i), sum_w, args=i)

    return summation(l - 1, min(p, q, r, p + q + 1 - r), sum_k)

def H_i(p, q):
    '''
    p, q are ints. Returns a float
    '''

    def inner(r, l):
        return A_prime(p, q, r, l) * omega(12, l, r)

    def sum_r(l):
        return summation(l, p + q + 2 - l, inner, args=l)

    val = 8 * summation(1, min(p, q) + 1, sum_r)

    return val

p, q = np.int64(8), np.int64(8)

print(H_i(p,q)) #nan
print(H_i(int(p) ,int(q))) #1.3480582058153066e-08

Solution

    • Numpy's int64 is a 64-bit integer, meaning it consists of 64 places that are either 0 or 1. Thus the smallest representable value is -2**63 and the biggest one is 2**63 - 1
    • Python's int is essentially unlimited in length, so it can represent any value. It is equivalent to a BigInteger in Java. It's stored as a list of int64s essentially that are considered a single large number.

    What you have here is a classic integer overflow. You mentioned that you "only" plug 36 into the factorial function, but the factorial function grows very fast, and 36! = 3.7e41 > 9.2e18 = 2**63 - 1, so you get a number bigger than you can represent in an int64!

    Since int64s are also called longs this is exactly what the warning overflow encountered in long_scalars is trying to tell you!