
How to properly reference a global variable in nested functions in Python


Let's say I have the following simple situation:

import pandas as pd

def multiply(row):
    global results
    results.append(row[0] * row[1])

def main():
    results = []
    df = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, {'a': 5, 'b': 6}])
    df.apply(multiply, axis=1)
    print(results)

if __name__ == '__main__':
    main()

This results in the following traceback:

Traceback (most recent call last):

  File "<ipython-input-2-58ca95c5b364>", line 1, in <module>
    main()

  File "<ipython-input-1-9bb1bda9e141>", line 11, in main
    df.apply(multiply, axis=1)

  File "C:\Users\bbritten\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4262, in apply
    ignore_failures=ignore_failures)

  File "C:\Users\bbritten\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4358, in _apply_standard
    results[i] = func(v)

  File "<ipython-input-1-9bb1bda9e141>", line 5, in multiply
    results.append(row[0] * row[1])

NameError: ("name 'results' is not defined", 'occurred at index 0')

I know that I can move results = [] into the if __name__ == '__main__': block to get this example to work, but is there a way to keep the structure I have now and make it work?


Solution

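  • The global statement in multiply() refers to a module-level name, but in your code results exists only as a local variable inside main(), hence the NameError. Declare results at module level, outside the functions, and remove the results = [] line from main() (otherwise main() prints its own, still empty, local list):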

    import pandas as pd
    
    results = []
    
    def multiply(row):
        # the rest of your code...
    

    UPDATE

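    Also note that global is only required when a function assigns to (rebinds) a module-level name. results.append(...) mutates the existing list in place and only reads the name, so the global declaration at the beginning of the function can be dropped. Example: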

    def multiply(row):
        # global results -> This is not necessary!
        results.append(row[0] * row[1])
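
A minimal, self-contained sketch of when global is and is not needed. It makes a few assumptions for illustration: plain tuples stand in for the DataFrame rows so that it runs without pandas, and count and count_row are hypothetical names that do not appear in the question.

```python
results = []  # module-level list shared by the functions below
count = 0     # module-level integer, used to show rebinding (hypothetical)


def multiply(row):
    # Appending mutates the existing list and only reads the name
    # `results`, so no `global` declaration is needed here.
    results.append(row[0] * row[1])


def count_row(row):
    # `count += 1` rebinds the module-level name, so `global` is required;
    # without it this line would raise UnboundLocalError.
    global count
    count += 1


# Plain tuples stand in for the DataFrame rows (assumption for illustration).
for row in [(1, 2), (3, 4), (5, 6)]:
    multiply(row)
    count_row(row)

print(results)  # [2, 12, 30]
print(count)    # 3
```

The same rule applies when multiply is passed to df.apply(..., axis=1): the function looks up results in the module namespace at call time, so the list has to exist there before apply runs.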