In pretty much all the examples I've seen, people first make an instance of a class:
ss = StandardScaler()
and only after that call methods on the instance: ss.fit_transform(df), rather than calling the method on the class itself: StandardScaler().fit_transform(df).
Thank you for the answers, which raised many clarifying points. Here are some of them as I understand them; please correct me if I'm mistaken.
Is this because of:
There are cases, which would throw an error otherwise.
There are cases, which don't throw an error, but produce different results (scary!)
It's OK to call on the class itself if it's used only once.
It's better to do just one thing on one line of code.
Aesthetics & opinion
Some other reason, please let me know!
import numpy as np
from sklearn.preprocessing import StandardScaler
# Here I'm using .integers() without assigning the generator to a variable first
int_array1 = np.random.default_rng(0).integers(10, size=(4,5))
# Here I'm using .fit_transform() without assigning the scaler to a variable first
int_array2 = StandardScaler().fit_transform(int_array1)
# This time assigning to a variable before use, for comparison
rng = np.random.default_rng(0)
int_array3 = rng.integers(10, size=(4,5))
ss = StandardScaler()
int_array4 = ss.fit_transform(int_array3)
print(int_array1)
print(int_array2)
print(int_array3)
print(int_array4)
The output is the same regardless of instantiation:
[[8 6 5 2 3]
[0 0 0 1 8]
[6 9 5 6 9]
[7 6 5 5 9]]
[[ 0.88354126 0.22941573 0.57735027 -0.72760688 -1.70856429]
[-1.68676059 -1.60591014 -1.73205081 -1.21267813 0.30151134]
[ 0.2409658 1.14707867 0.57735027 1.21267813 0.70352647]
[ 0.56225353 0.22941573 0.57735027 0.72760688 0.70352647]]
[[8 6 5 2 3]
[0 0 0 1 8]
[6 9 5 6 9]
[7 6 5 5 9]]
[[ 0.88354126 0.22941573 0.57735027 -0.72760688 -1.70856429]
[-1.68676059 -1.60591014 -1.73205081 -1.21267813 0.30151134]
[ 0.2409658 1.14707867 0.57735027 1.21267813 0.70352647]
[ 0.56225353 0.22941573 0.57735027 0.72760688 0.70352647]]
You have a misunderstanding. Both versions of your code create an instance:
ss = StandardScaler()
ss.fit_transform(df)
as well as
StandardScaler().fit_transform(df)
The variable name (ss) has nothing to do with creating an instance. The parentheses (()) after the class name are responsible for creating the instance.
Code without creation of an instance would look like
StandardScaler.fit_transform(df)
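# ^^ note the missing parentheses
Methods that can be called like this, without an instance, are called static methods. fit_transform is not one of them, so this particular call would fail.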
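To make the three call styles concrete, here is a minimal sketch. Greeter is a hypothetical class invented for this illustration (it has nothing to do with scikit-learn); it just shows an instance bound to a name, a temporary instance, and a call with no instance at all.

```python
class Greeter:
    """Hypothetical class, used only to illustrate the call styles."""

    def __init__(self, name="world"):
        self.name = name

    def greet(self):
        # Instance method: needs an instance, passed implicitly as `self`.
        return f"Hello, {self.name}!"

    @staticmethod
    def shout(text):
        # Static method: takes no `self`, so it works without an instance.
        return text.upper() + "!"


named = Greeter()
with_variable = named.greet()          # instance bound to a name
without_variable = Greeter().greet()   # temporary instance, no name
static_call = Greeter.shout("hello")   # no instance at all

try:
    Greeter.greet()                    # no parentheses after the class: no instance
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__    # the instance method has no `self` to work with

print(with_variable, without_variable, static_call, error_name)
```

The first two calls behave identically, because both create an instance; only the last one skips instantiation and therefore fails for a non-static method.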
Some other reason, please let me know!
You want an object to live longer if it holds state, i.e. contents inside the object that change over time.
The example you posted is perfect for demonstrating this; you just didn't do it right:
import numpy as np
# No variable assignment
print(np.random.default_rng(0).integers(10, size=(1, 5)))
print(np.random.default_rng(0).integers(10, size=(1, 5)))
print("-"*10)
# Variable assignment
rng = np.random.default_rng(0)
print(rng.integers(10, size=(1, 5)))
print(rng.integers(10, size=(1, 5)))
This demonstrates that the random number generator has state, and that this state is what makes it generate new random numbers on the next call.
Possible output:
[[8 6 5 2 3]]
[[8 6 5 2 3]]
----------
[[8 6 5 2 3]]
[[0 0 0 1 8]]
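The same reasoning applies to objects that learn something from data, which is the usual reason to keep a scaler in a variable: you fit it once and then transform other data with what it learned. The sketch below assumes a hypothetical, heavily simplified MeanCenterer class written for this illustration; it is not how StandardScaler is implemented.

```python
from statistics import mean


class MeanCenterer:
    """Hypothetical, simplified stand-in for a scaler: it only subtracts the mean."""

    def fit(self, values):
        # The state learned from the data is stored on the instance.
        self.mean_ = mean(values)
        return self

    def transform(self, values):
        return [v - self.mean_ for v in values]


train = [1.0, 2.0, 3.0]   # mean is 2.0
new_data = [10.0, 20.0]

# Keeping the instance lets us reuse what it learned from `train`.
centerer = MeanCenterer().fit(train)
reused = centerer.transform(new_data)

# A fresh instance has learned nothing, so transforming with it fails.
try:
    MeanCenterer().transform(new_data)
    fresh_failed = False
except AttributeError:
    fresh_failed = True

print(reused, fresh_failed)
```

If the fitted state is only needed for a single call, the unnamed form is fine; once you need it a second time, the variable is what keeps that state alive.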
There are cases, which would throw an error otherwise.
This should not happen. The code
ClassName().method()
also creates an instance; it just has no variable name assigned. It's like doing
temp = ClassName()
temp.method()
del temp # the variable is gone here
There are cases, which don't throw an error, but produce different results (scary!)
This should not happen for the same reason as before.
Prevents repetition of code (but it's ok, if it's used only once.)
As you say: the constructor runs once when you assign the variable. If you need the object more than once and don't assign it to a variable, the constructor ends up running multiple times.
When calling the same constructor over and over, DRY (the clean code principle "don't repeat yourself") becomes a reason, yes.
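A small sketch of that effect, assuming a hypothetical ExpensiveThing class that simply counts how often its constructor runs:

```python
class ExpensiveThing:
    """Hypothetical class that counts how often its constructor runs."""

    constructions = 0

    def __init__(self):
        ExpensiveThing.constructions += 1

    def work(self, x):
        return x * 2


# Without a variable: a new instance is constructed for every call.
results_a = [ExpensiveThing().work(x) for x in range(3)]
count_without_variable = ExpensiveThing.constructions

# With a variable: the constructor runs once and the instance is reused.
ExpensiveThing.constructions = 0
thing = ExpensiveThing()
results_b = [thing.work(x) for x in range(3)]
count_with_variable = ExpensiveThing.constructions

print(count_without_variable, count_with_variable)
```

The results are the same in both cases; only the number of constructor calls differs, which matters when construction is costly or when the object is supposed to carry state between calls.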
It's better to do just one thing on one line of code.
I don't think there's such a rule. List comprehensions are often the opposite.
Aesthetics & opinion.
Always :-)