python pass-by-reference pass-by-value

Should I ever return a list that was passed by reference and modified?


I have recently discovered that lists in Python are automatically passed by reference (unless the slice notation array[:] is used to pass a copy). For example, these two functions do the same thing:

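def foo(z):
    z.append(3)  # mutates the caller's list; implicitly returns None

def bar(z):
    z.append(3)  # same mutation
    return z     # returns the same list object that was passed in

x = [1, 2]
y = [1, 2]
foo(x)
bar(y)
print(x, y)  # [1, 2, 3] [1, 2, 3]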

Until now, I always returned arrays that I manipulated, because I thought I had to. Now I understand that it's superfluous (and perhaps inefficient), but returning values seems like generally good practice for code readability. My question is: are there any issues with either of these approaches, and what are the best practices? Is there a third option that I am missing? I'm sorry if this has been asked before, but I couldn't find anything that really answers my question.


Solution

  • This answer works on the assumption that the decision as to whether to modify your input in-place or return a copy has already been made.

    As you noted, whether or not to return a modified object is a matter of opinion, since the result is functionally equivalent. In general, it is considered good form not to return a list that is modified in place: returning nothing signals that the function works by side effect rather than by producing a new object. According to the Zen of Python (item #2):

    Explicit is better than implicit.

    This is borne out in the standard library. The list methods that mutate in place (list.append, list.insert, list.extend, list.sort, etc.) all return None, which makes them a notoriously frequent source of questions on SO.
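
    A minimal sketch of that convention using only built-ins: the mutating methods return None, while the copy-returning function sorted() leaves its input alone. The variable names are illustrative only.

```python
# In-place list methods mutate the list and return None.
nums = [3, 1, 2]
append_result = nums.append(0)   # nums is now [3, 1, 2, 0]
sort_result = nums.sort()        # nums is now [0, 1, 2, 3]

# sorted() is the copy-returning counterpart: it builds a new list
# and leaves its argument untouched.
original = [3, 1, 2]
sorted_copy = sorted(original)

print(append_result, sort_result)  # None None
print(nums)                        # [0, 1, 2, 3]
print(original, sorted_copy)       # [3, 1, 2] [1, 2, 3]
```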

    NumPy also uses this pattern frequently, since it often deals with large data sets that would be impractical to copy and return. A common example is the array method numpy.ndarray.sort, which sorts in place and returns None, not to be confused with the top-level function numpy.sort, which returns a sorted copy.
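
    The difference between the two sort calls can be seen in a short sketch, assuming NumPy is installed; the array contents and variable names are made up for illustration.

```python
import numpy as np

a = np.array([3, 1, 2])

# Top-level function: returns a new sorted array, a is unchanged.
copy_sorted = np.sort(a)
before = a.tolist()

# Array method: sorts a in place and returns None.
in_place_result = a.sort()

print(before, copy_sorted.tolist())  # [3, 1, 2] [1, 2, 3]
print(in_place_result, a.tolist())   # None [1, 2, 3]
```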

    The idea is very much a part of the Python way of thinking. Here is an excerpt from Guido's email that explains the whys and wherefores:

    I find the chaining form a threat to readability; it requires that the reader must be intimately familiar with each of the methods. The second [unchained] form makes it clear that each of these calls acts on the same object, and so even if you don't know the class and its methods very well, you can understand that the second and third call are applied to x (and that all calls are made for their side-effects), and not to something else.
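
    To make the quoted distinction concrete, here is a small sketch using a plain list rather than the class from the email. The unchained form reads as a series of side effects on x, and the chained form fails outright because the in-place methods return None.

```python
x = [3, 1, 2]

# Unchained form: every call visibly acts on x, purely for its side effect.
x.append(0)
x.sort()
x.reverse()
unchained = list(x)
print(unchained)  # [3, 2, 1, 0]

# Chained form: not possible with in-place methods, because sort()
# returns None rather than the list.
chaining_error = None
try:
    x.sort().reverse()
except AttributeError as exc:
    chaining_error = exc
    print("chaining failed:", exc)

# Note that the sort() half of the failed chain still ran on x.
print(x)  # [0, 1, 2, 3]
```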