
mean, nanmean and warning: Mean of empty slice


Say I construct three numpy arrays:

import numpy as np

a = np.array([1, 2, 3])
b = np.array([np.nan, np.nan, 3])
c = np.array([np.nan, np.nan, np.nan])

Now I find that np.mean returns nan for both b and c:

>>> np.mean(a)
2.0
>>> np.mean(b)
nan
>>> np.mean(c)
nan

Since NumPy 1.8 (released 30 October 2013), we've been blessed with nanmean, which ignores nan values:

>>> np.nanmean(a)
2.0
>>> np.nanmean(b)
3.0
>>> np.nanmean(c)
nan
C:\python-3.4.3\lib\site-packages\numpy\lib\nanfunctions.py:598: RuntimeWarning: Mean of empty slice
  warnings.warn("Mean of empty slice", RuntimeWarning)

So, nanmean is great, but it has the odd and undesirable behaviour of issuing a warning when the array contains nothing but nan values.

How can I get the behaviour of nanmean without that warning? I don't like warnings, and I don't like suppressing them manually.


Solution

  • I really can't see any good reason not to just suppress the warning.

    The safest way is to use the warnings.catch_warnings context manager to suppress the warning only where you expect it to occur. That way you won't miss any other RuntimeWarnings that are raised unexpectedly elsewhere in your code:

    import numpy as np
    import warnings
    
    x = np.ones((1000, 1000)) * np.nan
    
    # I expect to see RuntimeWarnings in this block
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        foo = np.nanmean(x, axis=1)
    

    @dawg's solution would also work, but any extra steps you take to avoid computing np.nanmean on an array of all NaNs will incur overhead that you could avoid by simply suppressing the warning. Suppressing it also reflects your intent much more clearly in the code.
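
If ignoring every RuntimeWarning in the block feels too broad, the filter can be narrowed to this one message. The following is only a minimal sketch: quiet_nanmean is a hypothetical helper name (not part of NumPy), and it assumes the warning text still begins with "Mean of empty slice", as in the output shown in the question.

```python
import warnings

import numpy as np


def quiet_nanmean(values, axis=None):
    """Hypothetical helper: np.nanmean without the all-NaN warning."""
    with warnings.catch_warnings():
        # Ignore only RuntimeWarnings whose message starts with this text
        # (the message argument is a regex matched at the start);
        # other RuntimeWarnings raised inside the block still surface.
        warnings.filterwarnings(
            "ignore", message="Mean of empty slice", category=RuntimeWarning
        )
        return np.nanmean(values, axis=axis)


rows = np.array([[1.0, np.nan, 3.0],
                 [np.nan, np.nan, np.nan]])
row_means = quiet_nanmean(rows, axis=1)  # second row is all NaN
```

The filter is restored when the with block exits, so calls to np.nanmean elsewhere in the program still warn as usual.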