Tags: python, logging, time, python-requests, global-variables

Alternative to global variables when logging stats about requests


I have a program that logs some messages about data that I download. In addition, I would like to display some stats about the requests after every k requests that I make to a site (k is 10 in my case), plus some overall stats at the end of the execution.

At the moment I have an implementation that I am not happy with, as it uses global variables. I am looking for a cleaner alternative. It looks like this (note: please ignore the fact that I am using print instead of logging and that I am measuring elapsed time with time.time instead of time.perf_counter; I have read that the latter would be the better option):

import time
import pprint

def f2(*args, **kwargs):
    global START_TIME

    global NO_REQUESTS
    global TOTAL_TIME_FOR_REQUESTS
    global MAX_TIME_FOR_REQUEST
    global AVERAGE_TIME_FOR_REQUESTS

    global TOTAL_TIME_FOR_DECODING
    global TOTAL_TIME_FOR_INTERSECT
    
    # ... logic that changes values of most of these global variables

    if NO_REQUESTS % 10 == 0:
        AVERAGE_TIME_FOR_REQUESTS = TOTAL_TIME_FOR_REQUESTS / NO_REQUESTS
        print()
        print('no requests so far: ' + str(NO_REQUESTS))
        print('average request time: {:.2f}s'.format(AVERAGE_TIME_FOR_REQUESTS))
        print('max request time: {:.2f}s'.format(MAX_TIME_FOR_REQUEST))
                    
        elapsed = time.time() - START_TIME
        # int() so the output reads "0h 1m" rather than "0.0h 1.0m"
        hours_elapsed = int(elapsed // 3600)
        minutes_elapsed = int((elapsed % 3600) // 60)
        seconds_elapsed = elapsed % 60
        print('time elapsed so far: {}h {}m {:.2f}s'.format(hours_elapsed, minutes_elapsed, seconds_elapsed))
        print()

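    # Timed on every call, not only when the stats are printed;
    # `res` comes from the elided logic above
    time5 = time.time()
    decoded = some_module.decode(res.content)
    time6 = time.time()

    elapsed2 = time6 - time5
    TOTAL_TIME_FOR_DECODING += elapsed2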

    return something


def f1(*args, **kwargs):

    global START_TIME

    global TOTAL_TIME_FOR_REQUESTS
    TOTAL_TIME_FOR_REQUESTS = 0
    global MAX_TIME_FOR_REQUEST
    MAX_TIME_FOR_REQUEST = 0
    global NO_REQUESTS
    NO_REQUESTS = 0
    global AVERAGE_TIME_FOR_REQUESTS
    AVERAGE_TIME_FOR_REQUESTS = 0

    global TOTAL_TIME_FOR_DECODING
    TOTAL_TIME_FOR_DECODING = 0
    global TOTAL_TIME_FOR_INTERSECT
    TOTAL_TIME_FOR_INTERSECT = 0

    f2() # notice call to other function!

    # ... some logic
        
    return some_results


def output_final_stats(elapsed, results, precision='{:.3f}'):
    print()
    print('=============================')

    # int() so the output reads "0h 1m" rather than "0.0h 1.0m"
    hours_elapsed = int(elapsed // 3600)
    minutes_elapsed = int((elapsed % 3600) // 60)
    seconds_elapsed = elapsed % 60
    print("TIME ELAPSED: {:.3f}s OR {}h {}m {:.3f}s".format(
        elapsed, hours_elapsed, minutes_elapsed, seconds_elapsed))

    print("out of which:")
    # print((precision+'s for requests)'.format(TOTAL_TIME_FOR_REQUESTS)))
    print('{:.3f}s for requests'.format(TOTAL_TIME_FOR_REQUESTS))
    print('{:.3f}s for decoding'.format(TOTAL_TIME_FOR_DECODING))
    print('{:.3f}s for intersect'.format(TOTAL_TIME_FOR_INTERSECT))

    total = TOTAL_TIME_FOR_REQUESTS + TOTAL_TIME_FOR_DECODING + TOTAL_TIME_FOR_INTERSECT
    print('EXPECTED: {:.3f}s'.format(total))
    print('DIFF: {:.3f}s'.format(elapsed - total))
    print()
    print('AVERAGE REQUEST TIME: {:.3f}s'.format(AVERAGE_TIME_FOR_REQUESTS))
    print('TOTAL NO. REQUESTS: ' + str(NO_REQUESTS))
    print('MAX REQUEST TIME: {:.3f}s'.format(MAX_TIME_FOR_REQUEST))
    print('TOTAL NO. RESULTS: ' + str(len(results)))
    print('RESULTS:')
    # pprint is imported as a module, so the function is pprint.pprint
    pprint.pprint(results, indent=4)


if __name__ == '__main__':

    START_TIME = time.time()
    results = f1(some_params)
    final_time = time.time()

    elapsed = final_time - START_TIME
    output_final_stats(elapsed, results)


The way I thought of it (not sure if it is the best option, I am open to alternatives) is to somehow have a listener on the NO_REQUESTS variable that, whenever the number reaches a multiple of 10, triggers the logging of the variables I am interested in. But where would I store those variables, and what would their namespace be?

Another alternative would be to maybe have a parametrised decorator for one of my functions, but in this case I am not sure how easy it would be to pass the values that I am interested in from one function to another.


Solution

  • I think the cleanest way is to use a parametrized decorator implemented as a class. The instance stores the counters, so no global variables are needed.

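    import functools
    import time

    class LogEveryN:
        def __init__(self, n=10):
            self.n = n
            self.number_of_requests = 0
            self.total_time_for_requests = 0
            self.max_time_for_request = 0
            self.average_time_for_request = 0

        def __call__(self, func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                result = func(*args, **kwargs)
                elapsed = time.perf_counter() - start

                # Update the stats kept on the decorator instance
                self.number_of_requests += 1
                self.total_time_for_requests += elapsed
                self.max_time_for_request = max(self.max_time_for_request, elapsed)

                # Log only on every n-th call
                if self.number_of_requests % self.n == 0:
                    self.average_time_for_request = (
                        self.total_time_for_requests / self.number_of_requests)
                    print('no requests so far: {}'.format(self.number_of_requests))
                    print('average request time: {:.2f}s'.format(self.average_time_for_request))
                    print('max request time: {:.2f}s'.format(self.max_time_for_request))

                return result
            return wrapper

    @LogEveryN(n=5)
    def request_function():
        pass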
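
The question also asks for overall stats at the end of the run. Applying the decorator as `@LogEveryN(n=5)` leaves no obvious handle on the instance afterwards, so one option is to create the instance first, keep it in a variable, and decorate with that variable. The following is only a sketch under that assumption: `RequestStats`, `fetch`, `summary`, and the `report` callback are hypothetical names, and `time.sleep` stands in for the real download.

```python
import functools
import time


class RequestStats:
    """Hypothetical stats holder that also works as a decorator."""

    def __init__(self, n=10, report=print):
        self.n = n
        self.report = report  # callable that receives each periodic summary
        self.count = 0
        self.total_time = 0.0
        self.max_time = 0.0

    @property
    def average_time(self):
        # Guard against division by zero before the first request
        return self.total_time / self.count if self.count else 0.0

    def summary(self):
        return 'requests: {}, average: {:.3f}s, max: {:.3f}s'.format(
            self.count, self.average_time, self.max_time)

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if func raises, so failed requests are counted too
                elapsed = time.perf_counter() - start
                self.count += 1
                self.total_time += elapsed
                self.max_time = max(self.max_time, elapsed)
                if self.count % self.n == 0:
                    self.report(self.summary())
        return wrapper


# Keep a reference to the instance so the totals stay reachable later
request_stats = RequestStats(n=10)


@request_stats
def fetch(url):
    # Placeholder for the real download, e.g. a requests.get(url) call
    time.sleep(0.001)
    return url


for i in range(25):
    fetch('https://example.com/{}'.format(i))

# Final stats come from the same object, with no global statements
print('FINAL', request_stats.summary())
```

Because the counters live on `request_stats`, a function such as `output_final_stats` could take that object as a parameter instead of reading module-level globals. Passing `report=logging.getLogger(__name__).info` would route the periodic output through logging rather than print.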