Tags: python, arrays, numpy, nonetype, numpy-ndarray

Why does numpy.ndarray allow for a "None" array?


I was wondering what is the rationale for the following functionality of numpy.ndarray:

>>> import numpy as np
>>> a = None
>>> a = np.asarray(a)
>>> a
array(None, dtype=object)

>>> type(a)
<class 'numpy.ndarray'>

>>> a == None
True

>>> a is None
False

So in this case NumPy seems to actually create a "None array" (not an array of Nones), which seems to enforce a type on the variable a. But the documentation states that the positional argument needs to be "array_like":

a : array_like

Input data, in any form that can be converted to an array. This includes lists, lists of tuples, tuples, tuples of tuples, tuples of lists and ndarrays.

So why is None accepted as "array-like" when it is none of the forms listed above?

By analogy, list(None) raises a TypeError because None is not "iterable", as per the documentation.

Furthermore, some functions seem to actually return seemingly incorrect values. For example np.ndarray.argmax() or np.ndarray.argmin() actually return 0 for a "None array", but result in an error for an empty array which intuitively seems like the expected behaviour.

>>> a
array(None, dtype=object)
>>> b
array([], dtype=object)
>>> a.argmax()
0
>>> b.argmax()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: attempt to get argmax of an empty sequence

Is there actually any advantage to having a "None array" (array(None, dtype=object)) as opposed to an empty array (array([], dtype=object))?

Is this intended functionality, or an accidental consequence of Nones being actual objects? Could someone explain what's going on under the bonnet here and why?

Thanks a lot!


Solution

  • What you get from np.asarray(None) is an array with shape () and dtype object, that is, a zero-dimensional array holding a single element (often loosely called a scalar). You get something similar from np.asarray(2) or np.asarray('abc'). Zero-dimensional arrays cannot be iterated, but they can be compared to non-NumPy values. At the same time, they support the usual NumPy operations, so you can reshape one into a one-element array and then iterate over it:

    list(np.asarray(None).reshape((1,)))
    

    This works and returns [None].

    About functions like argmin or argmax: note that a zero-dimensional (scalar) array is not empty. An array with shape () has one element yet zero dimensions, while an array with shape (0,) has no elements but one dimension. This may be counterintuitive, but it is consistent and makes things work. As documented, argmin and argmax operate on the flattened array when no axis is given. The flattened form of a zero-dimensional array (e.g. np.asarray(None).ravel()) has shape (1,); since it holds only one value, the index of the smallest or greatest value is 0 in both cases. Interestingly, np.argmin(np.asarray([None, None])) fails with a TypeError, because with two elements NumPy has to compare them to find the smallest, and None values cannot be ordered.
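
The distinctions above can be checked directly. The following is only a minimal sketch, assuming a reasonably recent NumPy; the variable names zero_d, empty and flat are illustrative and not part of the question, and the exact wording of the error messages may differ between versions.

```python
import numpy as np

# A zero-dimensional object array wrapping None: one element, no axes.
zero_d = np.asarray(None)
# An empty one-dimensional object array: one axis, no elements.
empty = np.array([], dtype=object)

print(zero_d.shape, zero_d.ndim, zero_d.size)  # () 0 1
print(empty.shape, empty.ndim, empty.size)     # (0,) 1 0

# Iterating over a 0-d array fails, much like list(None) does.
try:
    list(zero_d)
except TypeError as exc:
    print("TypeError:", exc)

# Flattening yields exactly one element, so argmax/argmin have a single candidate.
flat = zero_d.ravel()
print(flat.shape)       # (1,)
print(zero_d.argmax())  # 0
print(zero_d.argmin())  # 0

# With no elements at all there is no index to return.
try:
    empty.argmax()
except ValueError as exc:
    print("ValueError:", exc)

# Two None elements would have to be ordered against each other,
# which None does not support.
try:
    np.argmin(np.asarray([None, None]))
except TypeError as exc:
    print("TypeError:", exc)
```

Seen this way, array(None, dtype=object) and array([], dtype=object) are not alternatives to each other: the first holds one element (None) and has no axes, while the second has one axis and holds nothing.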