I am completely new to python or any such programming language. I have some experience with Mathematica. I have a mathematical problem which though Mathematica solves with her own 'Parallelize' methods but leaves the system quite exhausted after using all the cores! I can barely use the machine during the run. Hence, I was looking for some coding alternative and found python kind of easy to learn and implement. So without further ado, let me tell you the mathematical problem and issues with my python code. As the full code is too long, let me give an outline.
1. Numericall solve a differential equation of the form y''(t) + f(t)y(t)=0, to get y(t) for some range, say C <= t <= D
2.Next, Interpolate the numerical result for some desired range to get the function: w(t), say for A <= t <= B
3. Using w(t), to solve another differential equation of the form z''(t) + [ a + b W(t)] z(t) =0 for some range of a and b, for which I am using the loop.
4. Deine F = 1 + sol1[157], to make a list like {a, b, F}. So let me give a prototype loop as this take most of the computation time.
for q in np.linspace(0.0, 4.0, 100):
for a in np.linspace(-2.0, 7.0, 100):
print('Solving for q = {}, a = {}'.format(q,a))
sol1 = odeint(fun, [1, 0], t, args=( a, q))[..., 0]
print(t[157])
F = 1 + sol1[157]
f1.write("{} {} {} \n".format(q, a, F))
f1.close()
Now, the real loop takes about 4 hrs and 30 minutes to complete (With some built-in functional form of w(t), it takes about 2 minute). When, I applied (without properly understanding what it does and how!) numba/autojit before the definition of fun in my code, the run time significantly improved and takes about 2 hrs and 30 minute. Also, writing two loops as itertools/product further reduces the run time by about 2 minutes only! However, Mathematica, when I let her use all the 4 cores, finishes the task within 30 minutes.
So, is there a way to improve the runtime in python?
To speed up python, you have three options:
Implementing multiprocessing - example using the prototype loop from the original question
I assume that the computations you do inside the nested loops in your prototype code are actually independent from one another. Since your prototype code is incomplete, I am not sure this is the case, however. Otherwise it will, of course, not work. I will give an example using not your differential equation problem for the fun function but a prototype of the same signature (input and output variables).
import numpy as np
import scipy.integrate
import multiprocessing as mp
def fun(y, t, b, c):
# replace this function with whatever function you want to work with
# (this one is the example function from the scipy docs for odeint)
theta, omega = y
dydt = [omega, -b*omega - c*np.sin(theta)]
return dydt
#definitions of work thread and write thread functions
def run_thread(input_queue, output_queue):
# run threads will pull tasks from the input_queue, push results into output_queue
while True:
try:
queueitem = input_queue.get(block = False)
if len(queueitem) == 3:
a, q, t = queueitem
sol1 = scipy.integrate.odeint(fun, [1, 0], t, args=( a, q))[..., 0]
F = 1 + sol1[157]
output_queue.put((q, a, F))
except Exception as e:
print(str(e))
print("Queue exhausted, terminating")
break
def write_thread(queue):
# write thread will pull results from output_queue, write them to outputfile.txt
f1 = open("outputfile.txt", "w")
while True:
try:
queueitem = queue.get(block = False)
if queueitem[0] == "TERMINATE":
f1.close()
break
else:
q, a, F = queueitem
print("{} {} {} \n".format(q, a, F))
f1.write("{} {} {} \n".format(q, a, F))
except:
# necessary since it will throw an error whenever output_queue is empty
pass
# define time point sequence
t = np.linspace(0, 10, 201)
# prepare input and output Queues
mpM = mp.Manager()
input_queue = mpM.Queue()
output_queue = mpM.Queue()
# prepare tasks, collect them in input_queue
for q in np.linspace(0.0, 4.0, 100):
for a in np.linspace(-2.0, 7.0, 100):
# Your computations as commented here will now happen in run_threads as defined above and created below
# print('Solving for q = {}, a = {}'.format(q,a))
# sol1 = scipy.integrate.odeint(fun, [1, 0], t, args=( a, q))[..., 0]
# print(t[157])
# F = 1 + sol1[157]
input_tupel = (a, q, t)
input_queue.put(input_tupel)
# create threads
thread_number = mp.cpu_count()
procs_list = [mp.Process(target = run_thread , args = (input_queue, output_queue)) for i in range(thread_number)]
write_proc = mp.Process(target = write_thread, args = (output_queue,))
# start threads
for proc in procs_list:
proc.start()
write_proc.start()
# wait for run_threads to finish
for proc in procs_list:
proc.join()
# terminate write_thread
output_queue.put(("TERMINATE",))
write_proc.join()
Explanation
run_thread
) that is run in the threads. This function computes individual problems until there are none left in the input Queue; it pushes the results into an output Queue.write_thread
) for collecting the results from the output queue and writing them into a file.Caveats