
How can I make this simulation of the year a solar flare hits the Earth faster?


I wrote this code and am trying to learn how to write more efficient, better-performing code.

import random

def CalcAverageSolarFlareEvent(EventList):
    return sum(EventList) / len(EventList)

percentage_solar_flare = 12
decade_counting = 0
Event = []
CurrentYear = 2022

for Simulations in range(1, 999999):
    while True:
        if random.randint(1, 100) != percentage_solar_flare:
            decade_counting += 1
        else:
            Event.append(decade_counting)
            decade_counting = 0
            break

print("In the Year "+str(int(CalcAverageSolarFlareEvent(Event))*10+CurrentYear) +
      " we got a Solarflare")

I tried counting only the decades in decade_counting and adding the current year at the end, to free up RAM.


Solution

  • Python, and especially the standard CPython implementation, is not well suited to this kind of code. Consider using PyPy, Pyston, or an embedded JIT (just-in-time) compiler such as Numba, or alternatively a compiled language.

    Moreover, you do not need to append items to a list in order to count or sum them: you can keep a running sum and count on the fly.

    Here is a modified version of the code using the Numba JIT:

    import random
    import numba as nb
    
    @nb.njit('()')
    def compute():
        percentage_solar_flare = 12
        decade_counting = 0
        sum_Events = 0
        count_Event = 0
        CurrentYear = 2022
    
        for Simulations in range(1, 999_999):
            while True:
                if random.randint(1, 100) != percentage_solar_flare:
                    decade_counting += 1
                else:
                    sum_Events += decade_counting
                    count_Event += 1
                    decade_counting = 0
                    break
    
        print("In the Year "+str(int(sum_Events/count_Event)*10+CurrentYear) +
              " we got a Solarflare")
    
    compute()
    

    The initial code takes 65 seconds on my machine, while this one takes 4 seconds, so it is about 16 times faster. Most of the time is spent generating random values. SIMD instructions and multithreading can improve performance further (by 1 to 2 orders of magnitude), but using them is certainly not a good idea if you are a beginner.
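Since most of the time goes into generating random values, another option is to draw fewer of them. This is a different technique from the JIT approach above, shown only as a minimal sketch. The test `random.randint(1, 100) != percentage_solar_flare` fails with probability 1/100 per decade, so the number of flare-free decades before a flare follows a geometric distribution and can be sampled with a single random number by inverse-transform sampling. The function name `average_decades_until_flare` and its parameters `n_simulations`, `p`, and `seed` are hypothetical names introduced for this illustration.

```python
import math
import random

def average_decades_until_flare(n_simulations=999_998, p=0.01, seed=None):
    """Estimate the mean number of flare-free decades before the first flare.

    Assumption: each decade has an independent chance p of a flare, matching
    the 1-in-100 chance of randint(1, 100) hitting one specific value.
    """
    rng = random.Random(seed)
    log_q = math.log(1.0 - p)  # log of the per-decade "no flare" probability
    total = 0
    for _ in range(n_simulations):
        # 1 - random() lies in (0, 1], which avoids log(0).
        u = 1.0 - rng.random()
        # Inverse-transform sample: number of failures before the first success.
        total += int(math.log(u) / log_q)
    return total / n_simulations

current_year = 2022
average = average_decades_until_flare(seed=0)
print("In the Year " + str(int(average) * 10 + current_year) + " we got a Solarflare")
```

Each simulation now needs one random number instead of about a hundred, and the running sum replaces the list as in the Numba version. For this model the expected value is also known exactly, (1 - p) / p = 99 decades, so the simulation mostly serves as a check.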