
Where to put checks on the inputs of a class?


Where should I put checks on the inputs of a class? Right now I'm putting them in __init__ as follows, but I'm not sure whether that's correct. See the example below.

import numpy as np

class MedianTwoSortedArrays:
    def __init__(self, sorted_array1, sorted_array2):
        
        # check inputs --------------------------------------------------------
        # check if input arrays are np.ndarray's
        if not isinstance(sorted_array1, np.ndarray) or \
            not isinstance(sorted_array2, np.ndarray):
            raise Exception("Input arrays need to be sorted np.ndarray's")
            
        # check if input arrays are 1D
        if sorted_array1.ndim != 1 or sorted_array2.ndim != 1:
            raise Exception("Input arrays need to be 1D np.ndarray's")
        
        # check if both input arrays are sorted - note that this is O(n + m)
        for array in (sorted_array1, sorted_array2):
            for ind in range(len(array) - 1):
                if array[ind] > array[ind + 1]:
                    raise Exception("Input arrays need to be sorted")
        
        # end of input checks--------------------------------------------------
        
        self.sorted_array1 = sorted_array1
        self.sorted_array2 = sorted_array2

Solution

    General Validation

    You generally have two opportunities to inspect the arguments passed to a constructor expression:

    • In the __new__ method, used for instance creation
    • In the __init__ method, used for instance initialization

    You should generally use __init__ for initialization and validation. Only use __new__ in cases where its unique characteristics are required. So, your checks are in the "correct" place.

    If you also want to validate any assignments to the instance variables that occur after initialization, you might find this question and its answers helpful.
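
One common way to validate later assignments is a property whose setter runs the checks. The following is only a minimal sketch: the class name SortedValues and its attribute values are hypothetical, and it uses plain lists rather than np.ndarray to keep the example short.

```python
class SortedValues:
    """Hypothetical class: validates on construction and on reassignment."""

    def __init__(self, values):
        # Assigning through the property reuses the setter's checks.
        self.values = values

    @property
    def values(self):
        return self._values

    @values.setter
    def values(self, new_values):
        if not isinstance(new_values, list):
            raise TypeError("values must be a list")
        # Every element must be <= its successor.
        if any(a > b for a, b in zip(new_values, new_values[1:])):
            raise ValueError("values must be sorted in ascending order")
        self._values = new_values


holder = SortedValues([1, 2, 3])
holder.values = [4, 5, 6]      # passes validation

try:
    holder.values = [3, 1, 2]  # unsorted, so the setter rejects it
except ValueError as exc:
    print(exc)
```

Because __init__ assigns through the property, the checks live in one place and apply both at construction and afterwards. Note that the setter cannot catch in-place mutation of the list itself.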

    Validation in __new__

    One of the distinguishing characteristics of __new__ is that it is called before an instance of the relevant class is created. In fact, the whole purpose of __new__ is to create the instance. (This instance is then passed to __init__ for initialization.)
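
The call order described above can be observed with a small sketch. The class Traced and the list calls are hypothetical names used only for illustration.

```python
calls = []


class Traced:
    def __new__(cls, value):
        calls.append("__new__")
        # object.__new__ takes only the class and returns the bare instance.
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("__init__")
        self.value = value


obj = Traced(42)
print(calls)  # ['__new__', '__init__']
```

If __new__ raised an exception here, no instance would be created and __init__ would never run.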

    As stated in its documentation, "__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation." Hence, you would likely include validation logic in __new__, rather than __init__, when subclassing an immutable type.

    Consider a simple example in which you want to create a subclass of tuple, called Point2D, that only allows the creation of instances containing two floats; numeric arguments are converted to float (whether it is sensible to subclass tuple for this purpose is another question):

    class Point2D(tuple):
        def __new__(cls, x, y):
            if not isinstance(x, (int, float)) or not isinstance(y, (int, float)):
                error = "The coordinates of a 2D point have to be numbers"
                raise TypeError(error)
    
            return super().__new__(cls, (float(x), float(y)))
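
The following sketch repeats Point2D so that it runs on its own, and adds a hypothetical counter-example, InitOnlyPoint, that puts the same check in __init__ instead. It is meant to illustrate why __new__ is the natural place here, not to be a definitive design.

```python
class Point2D(tuple):
    def __new__(cls, x, y):
        if not isinstance(x, (int, float)) or not isinstance(y, (int, float)):
            raise TypeError("The coordinates of a 2D point have to be numbers")
        return super().__new__(cls, (float(x), float(y)))


class InitOnlyPoint(tuple):
    # Hypothetical counter-example: validation placed in __init__ only.
    def __init__(self, x, y):
        if not isinstance(x, (int, float)) or not isinstance(y, (int, float)):
            raise TypeError("The coordinates of a 2D point have to be numbers")


point = Point2D(1, 2.5)
print(point)  # (1.0, 2.5)

try:
    Point2D("1", 2)  # rejected by the check in __new__
except TypeError as exc:
    print(exc)

try:
    # tuple.__new__ receives both arguments before __init__ ever runs,
    # and it accepts at most one positional argument.
    InitOnlyPoint(1, 2.5)
except TypeError as exc:
    print(exc)
```

With an immutable base type, the contents are fixed by the time __init__ is called, so both the validation and the conversion to float have to happen in __new__.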
    

    The documentation on __new__ states that "it is also commonly overridden in custom metaclasses in order to customize class creation." This use case, and numerous other use cases, are beyond the scope of this question. If you are still interested in the differences between __new__ and __init__, you might find these sources helpful:

    Exception Types

    Unrelated to the main question: if the arguments are invalid, the raised exception should be as specific as possible. In your case:

    • If the arguments do not have the correct types, raise TypeError rather than just Exception.
    • If the arguments do not have the correct values (or shape), raise ValueError rather than just Exception.
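
Applied to the class from the question, these two points might look like the sketch below. The loop over both arrays and the vectorized sortedness check are assumptions about how the checks could be condensed. They are not part of the original code.

```python
import numpy as np


class MedianTwoSortedArrays:
    def __init__(self, sorted_array1, sorted_array2):
        for array in (sorted_array1, sorted_array2):
            # Wrong type: TypeError.
            if not isinstance(array, np.ndarray):
                raise TypeError("Input arrays need to be np.ndarray's")
            # Right type, wrong shape: ValueError.
            if array.ndim != 1:
                raise ValueError("Input arrays need to be 1D np.ndarray's")
            # Right type, wrong values: ValueError. Still O(n + m) overall.
            if np.any(array[:-1] > array[1:]):
                raise ValueError("Input arrays need to be sorted")

        self.sorted_array1 = sorted_array1
        self.sorted_array2 = sorted_array2


try:
    MedianTwoSortedArrays([1, 2, 3], np.array([4, 5]))
except TypeError as exc:
    print(exc)

try:
    MedianTwoSortedArrays(np.array([3, 1, 2]), np.array([4, 5]))
except ValueError as exc:
    print(exc)
```

Callers can then tell a wrong type from a wrong value by catching TypeError and ValueError separately.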