I am trying to write Python functions that enforce type checking. The way I would attempt to do this is by using assert and isinstance() in the first line of the function, like so:
from typing import Union

import numpy as np
import pandas as pd

array_like = Union[pd.core.series.Series, np.ndarray]
LOG_TRANSFORM_CONST = 1.01

def log_transform(feature: array_like) -> array_like:
    assert isinstance(feature, array_like)
    # First remove negative entries
    feature[feature < 0.0] = 0.0
    # Add a small constant to avoid NaNs while applying logs
    feature = feature + LOG_TRANSFORM_CONST
    return np.log(feature)
This code does not work, as you cannot use Union along with isinstance() (at least on Python versions before 3.10, where it raises a TypeError). However, the following piece of code does work:
def log_transform(feature: array_like) -> array_like:
    assert type(feature) in [pd.core.series.Series, np.ndarray]
    # First remove negative entries
    feature[feature < 0.0] = 0.0
    # Add a small constant to avoid NaNs while applying logs
    feature = feature + LOG_TRANSFORM_CONST
    return np.log(feature)

if __name__ == '__main__':
    df = pd.DataFrame(columns=['A', 'B'])
    df['A'] = [1, 2, 3, 4]
    df['B'] = [10, 20, 30, 40]
    tr_arr = log_transform(df.A)
    print(tr_arr)
    y = log_transform(np.array([2, 4, 6, 8, 10]))
    print(y)
My question is whether this practice is advisable. What are best practices regarding type checking in Python? I know that one can install third-party libraries specifically for type checking, but I'm trying to avoid that.
Checking types with assertions has limitations. First, the check happens at runtime, so you don't catch errors early, or at all if the code path containing the assertion is never executed; assertions are also skipped entirely when Python runs with the -O flag. Second, some types cannot be checked with assertions; for example, you can't assert that a variable has the type "function from number to number".
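If you still want a runtime check, here is a minimal sketch of how it might look. It relies on isinstance() accepting a tuple of classes, and it raises TypeError explicitly instead of using assert. To keep it runnable without numpy or pandas, it uses list and tuple as stand-ins for the question's array types; ArrayLike, ARRAY_TYPES and apply are hypothetical names chosen for illustration. The apply function shows the second limitation: at runtime you can confirm that something is callable, but not that it maps numbers to numbers.

```python
import math
from typing import Callable, List, Union

# Hypothetical stand-ins for the question's array_like alias, using built-in
# types so the sketch runs without numpy or pandas.
ArrayLike = Union[list, tuple]
ARRAY_TYPES = (list, tuple)  # isinstance() accepts a tuple of classes

LOG_TRANSFORM_CONST = 1.01


def log_transform(feature: ArrayLike) -> List[float]:
    # Raise explicitly rather than assert: assert statements are skipped
    # when Python runs with the -O flag.
    if not isinstance(feature, ARRAY_TYPES):
        raise TypeError(f"expected list or tuple, got {type(feature).__name__}")
    # Clamp negative entries to zero, then shift by the constant before log()
    return [math.log(max(x, 0.0) + LOG_TRANSFORM_CONST) for x in feature]


def apply(func: Callable[[float], float], value: float) -> float:
    # At runtime we can only confirm that func is callable; the
    # "number to number" part of the annotation is not verified here.
    if not callable(func):
        raise TypeError("func must be callable")
    return func(value)
```

Note that apply(lambda v: "oops", 1.0) passes the runtime check and returns a string, even though the annotation promises a float. With the real types from the question, the tuple would presumably be (pd.Series, np.ndarray).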
Using a type-checking tool is the best option. You can try mypy; one of its core contributors is Guido van Rossum, so it's legitimate :D
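To give a rough idea of what a static checker buys you, here is a small sketch. The names apply_all, halve and shout are made up for illustration. The annotations describe a "function from number to number" using typing.Callable, which is exactly the kind of type an assertion cannot check.

```python
from typing import Callable, List


def apply_all(func: Callable[[float], float], values: List[float]) -> List[float]:
    """Apply func to every value in values."""
    return [func(v) for v in values]


def halve(x: float) -> float:
    return x / 2.0


def shout(s: str) -> str:
    return s.upper()


print(apply_all(halve, [2.0, 4.0]))  # prints [1.0, 2.0]

# A static checker is expected to reject the call below without running it,
# because shout takes a str rather than a float:
# apply_all(shout, [2.0, 4.0])
```

Assuming the file is saved as example.py and mypy is installed, running `mypy example.py` with the last line uncommented should flag the incompatible argument before the program ever runs.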