Tags: python, class, instantiation

Why make an instance of a class before using it?


Original question

In pretty much all the examples I've seen, people first make an instance of a class: ss = StandardScaler(), and only after that call methods on the instance: ss.fit_transform(df), rather than calling the method on the class itself: StandardScaler().fit_transform(df).

Is this because of:

  1. There are cases that would throw an error otherwise.
  2. There are cases that don't throw an error, but produce different results (scary!)
  3. Prevents repetition of code (but it's ok, if it's used only once.)
  4. It's better to do just one thing on one line of code.
  5. Aesthetics & opinion.
  6. Some other reason, please let me know!

Some answers thus far

Thank you for the answers, which raised many clarifying points. Here are some of them as I understand them. Please correct me if I'm mistaken.

Potential reasons I suggested for making an instance first:

There are cases that would throw an error otherwise.

  • Thomas Weller's answer below states that there shouldn't be, since StandardScaler().fit_transform(df) also creates a temporary instance - it just doesn't get stored in a variable.

There are cases that don't throw an error, but produce different results (scary!)

  • Thomas Weller's answer below states that there shouldn't be, for the same reason: a temporary instance is created either way.

It's ok to call on the class itself, if it's used only once.

  • This seems to be true, as there is no reason to store the instance in a variable and repetition is not a problem.

It's better to do just one thing on one line of code.

  • Readability is more important than doing just one thing per line. In my opinion, both versions are just as clear and readable.

Aesthetics & opinion

  • There's some of these involved as well.

Some other reason, please let me know!

  • Of course object oriented programming is useful in many ways, but my question concerned only the isolated use of a class and a method someone else has already programmed for me.
  • My question wasn't about whether or not you can pass parameters to the class or the method - my example actually does this: np.random.default_rng(0).integers(10, size=(4,5))

Code Example

import numpy as np
from sklearn.preprocessing import StandardScaler

# Here I'm using .integers() without making an instance first
int_array1 = np.random.default_rng(0).integers(10, size=(4,5))

# Here I'm using .fit_transform() without making an instance first
int_array2 = StandardScaler().fit_transform(int_array1)

# This time instantiating before using for comparison
rng = np.random.default_rng(0)
int_array3 = rng.integers(10, size=(4,5))
ss = StandardScaler()
int_array4 = ss.fit_transform(int_array3)

print(int_array1)
print(int_array2)
print(int_array3)
print(int_array4)

The output is the same regardless of instantiation.

[[8 6 5 2 3]
 [0 0 0 1 8]
 [6 9 5 6 9]
 [7 6 5 5 9]]
[[ 0.88354126  0.22941573  0.57735027 -0.72760688 -1.70856429]
 [-1.68676059 -1.60591014 -1.73205081 -1.21267813  0.30151134]
 [ 0.2409658   1.14707867  0.57735027  1.21267813  0.70352647]
 [ 0.56225353  0.22941573  0.57735027  0.72760688  0.70352647]]
[[8 6 5 2 3]
 [0 0 0 1 8]
 [6 9 5 6 9]
 [7 6 5 5 9]]
[[ 0.88354126  0.22941573  0.57735027 -0.72760688 -1.70856429]
 [-1.68676059 -1.60591014 -1.73205081 -1.21267813  0.30151134]
 [ 0.2409658   1.14707867  0.57735027  1.21267813  0.70352647]
 [ 0.56225353  0.22941573  0.57735027  0.72760688  0.70352647]]

Solution

  • You have a misunderstanding. Both versions of your code create an instance:

    ss = StandardScaler()
    ss.fit_transform(df)
    

    as well as

    StandardScaler().fit_transform(df)
    

    The variable name (ss) has nothing to do with creating an instance. The parentheses (()) after the class name are what create the instance.

    Code without creation of an instance would look like

    StandardScaler.fit_transform(df)
    #             ^^ note the missing parentheses
    

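    Methods that can be called this way, on the class and without an instance, are called static methods. fit_transform is an instance method, so this particular call would fail.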
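
To make the distinction concrete, here is a minimal sketch using a made-up class. The names Scaler, scale and describe are hypothetical and not part of scikit-learn. It shows that a named and an unnamed instance behave the same, that a static method can be called on the class, and that an instance method called on the class raises a TypeError.

```python
class Scaler:
    """Hypothetical stand-in for a class such as StandardScaler."""

    def __init__(self, factor=2):
        self.factor = factor

    def scale(self, values):
        # Instance method: needs `self`, so it needs an instance.
        return [v * self.factor for v in values]

    @staticmethod
    def describe():
        # Static method: no `self`, so it can be called on the class.
        return "multiplies values by a factor"


# Parentheses after the class name create an instance, named or not.
named = Scaler()
result_named = named.scale([1, 2, 3])
result_unnamed = Scaler().scale([1, 2, 3])

# A static method works on the class itself.
description = Scaler.describe()

# An instance method called on the class receives the list as `self`
# and is then missing its `values` argument.
try:
    Scaler.scale([1, 2, 3])
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(result_named, result_unnamed, description, error_name)
```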

    Some other reason, please let me know!

    You want an object to live longer if it holds state, i.e. data inside the object that changes over time.

    The example you posted is perfect for demonstrating this; it just needs to call the generator more than once:

    import numpy as np
    # No variable assignment
    print(np.random.default_rng(0).integers(10, size=(1, 5)))
    print(np.random.default_rng(0).integers(10, size=(1, 5)))
    print("-"*10)
    # Variable assignment
    rng = np.random.default_rng(0)
    print(rng.integers(10, size=(1, 5)))
    print(rng.integers(10, size=(1, 5)))
    

    That way you can demonstrate that the random number generator has state, and that this state makes it generate new random numbers on the next call.

    Possible output:

    [[8 6 5 2 3]]
    [[8 6 5 2 3]]
    ----------
    [[8 6 5 2 3]]
    [[0 0 0 1 8]]
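
The same reasoning plausibly applies to scaler-like objects, which learn statistics when fitted and need them again later. The sketch below uses a hypothetical, pure-Python MeanCenterer class (an assumption for illustration, not scikit-learn's implementation) to show why you would keep the fitted instance around: the mean learned from one data set is reused on another.

```python
class MeanCenterer:
    """Hypothetical, simplified scaler; not scikit-learn's implementation."""

    def __init__(self):
        self.mean_ = None  # state: learned during fit()

    def fit(self, values):
        self.mean_ = sum(values) / len(values)
        return self  # returning self allows fit(...).transform(...)

    def transform(self, values):
        if self.mean_ is None:
            raise ValueError("call fit() before transform()")
        return [v - self.mean_ for v in values]


train = [1.0, 2.0, 3.0]
test = [10.0, 20.0]

# Keeping the instance lets the test data reuse the mean learned from train.
centerer = MeanCenterer().fit(train)
test_centered = centerer.transform(test)

# A fresh throwaway instance has no fitted state, so transform() fails.
try:
    MeanCenterer().transform(test)
    unfitted_error = None
except ValueError as exc:
    unfitted_error = str(exc)

print(centerer.mean_, test_centered, unfitted_error)
```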
    

    There are cases that would throw an error otherwise.

    This should not happen. The code

    ClassName().method()
    

    also creates an instance, it just has no variable name assigned. It's like doing

    temp = ClassName()
    temp.method()
    del temp            # the variable is gone here
    

    There are cases that don't throw an error, but produce different results (scary!)

    This should not happen for the same reason as before.
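
One way to convince yourself of this is to record every instance the constructor builds. The sketch below uses a hypothetical Greeter class and a module-level list named created, both invented for illustration. The chained call and the two-step call each create exactly one instance and return the same result.

```python
created = []  # records every instance the constructor builds


class Greeter:
    """Hypothetical class used only to observe instance creation."""

    def __init__(self, name):
        self.name = name
        created.append(self)

    def greet(self):
        return f"Hello, {self.name}!"


# Chained call: an instance is created, used, and never given a name.
chained = Greeter("Ada").greet()

# Two-step call: the same thing, but the instance is kept in a variable.
g = Greeter("Ada")
stored = g.greet()

print(chained == stored)
print(len(created))
```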

    Prevents repetition of code (but it's ok, if it's used only once.)

    As you say: the constructor runs once when you assign the instance to a variable. If you need the object more than once and don't assign it to a variable, the constructor runs again each time.

    When calling the same constructor over and over, DRY (the clean code principle "don't repeat yourself") becomes a reason, yes.
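
A rough way to see the cost is to count constructor calls. The class ExpensiveModel and its constructor_calls counter below are hypothetical, and the list of squares merely stands in for expensive setup work.

```python
class ExpensiveModel:
    """Hypothetical class whose constructor does costly setup."""

    constructor_calls = 0  # class-level counter shared by all instances

    def __init__(self):
        ExpensiveModel.constructor_calls += 1
        self.table = [i * i for i in range(1000)]  # stands in for expensive work

    def lookup(self, i):
        return self.table[i]


# Without a variable: the constructor runs once per call.
a = ExpensiveModel().lookup(3)
b = ExpensiveModel().lookup(4)
calls_without_variable = ExpensiveModel.constructor_calls

# With a variable: the constructor runs once and the instance is reused.
ExpensiveModel.constructor_calls = 0
model = ExpensiveModel()
c = model.lookup(3)
d = model.lookup(4)
calls_with_variable = ExpensiveModel.constructor_calls

print(calls_without_variable, calls_with_variable)
```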

    It's better to do just one thing on one line of code.

    I don't think there's such a rule. List comprehensions are often the opposite.
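
As a small illustration with made-up data, the list comprehension below loops, filters and transforms in a single line, and gives the same result as the step-by-step loop.

```python
words = ["alpha", "Beta", "gamma", "Delta"]

# One line that loops, filters, and transforms.
upper_if_lowercase_start = [w.upper() for w in words if w[0].islower()]

# The same logic, one step per line.
result = []
for w in words:
    if w[0].islower():
        result.append(w.upper())

print(upper_if_lowercase_start == result)
```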

    Aesthetics & opinion.

    Always :-)