I have a function that takes up ~95% of my execution time; I call it multiple times and it's making my program incredibly slow. I can't use the built-in multiprocessing library because of my reliance on PyCharm: https://youtrack.jetbrains.com/issue/PY-53235.
Here's my core problem:
import numpy as np
import time
import ray  # confirmed this library works in my env

def get_matched_time(match: np.ndarray, template: np.ndarray) -> list:
    """
    Return an iterable containing the closest value
    in the template array to each value of the input array.

    Args:
        match : for each value of match,
            find the closest value in template.
        template : values to match to.

    Returns:
        matched_time : iterable of size match.size containing
            values found in template closest to each value in match.
    """
    matched_time, matched_index = [], []
    for t in match:  # for each and every match value
        temp = [abs(t - valor) for valor in template]  # scan template for the closest time
        matched_index.append(temp.index(min(temp)))  # add the index
    return [template[idx] for idx in matched_index]
(...)
The match array (or any iterable) can vary between 10 and 12,000 values. The template array is usually upwards of 10,000 values as well, but always larger than match.
if __name__ == "__main__":
    start = time.time()
    returned_time = get_matched_time(match, template)
    end = time.time()
    print(f"Execution time: {end - start} s")

>>> Execution time: 12.573657751083374 s
Just in case it's unclear:
match = [1.1, 2.1, 5.1] # will always be smaller than template
template = [1.0, 2.0, 3.0, 4.0, 5.0]
My desired output is [1.0, 2.0, 5.0]
in any iterable form; order doesn't really matter because I can just sort by value afterwards.
I'm hoping to be able to assign each loop of for t in match to a process. I have been trying this with Ray's Parallel Iterators, but I don't understand the documentation well enough to implement it. If anybody can recommend a more efficient method, or a way to incorporate multiprocessing via Ray or another library, I would much appreciate it.
You can significantly improve the performance of your search by using vectorized numpy calls in place of your iterations. In particular, we can replace this code:
    for t in match:  # for each and every match value
        temp = [abs(t - valor) for valor in template]
        matched_index.append(temp.index(min(temp)))
with abundantly faster numpy operations. Rather than searching, we use vectorized math to find all the absolute differences between match and template. Then we find the minima of these absolute differences and the indices at which they occur. Finally, we return the matches:
def get_matched_time(match: np.ndarray, template: np.ndarray) -> np.ndarray:
    # broadcast to an (m, n) matrix of absolute differences
    match = match.reshape(-1, 1)
    a = np.abs(match - template)
    # index of the closest template value for each match value
    mins = np.argmin(a, axis=1)
    return template[mins]
In this code, I return an array since the inputs are arrays, which makes the code interoperable with numpy functions that might be applied to the outputs.
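One caveat: the broadcasting approach materializes a len(match) × len(template) matrix of differences, which at 12,000 × 10,000+ float64 values is on the order of a gigabyte. If memory becomes a concern at your sizes, here is a sketch of an alternative that keeps memory linear by binary-searching a sorted copy of template with np.searchsorted (the function name get_matched_time_sorted is mine, made up for illustration):

```python
import numpy as np

def get_matched_time_sorted(match: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Nearest-value lookup via binary search on a sorted template."""
    sorted_t = np.sort(template)                  # sort once, O(n log n)
    # index of the first sorted_t value >= each match value
    idx = np.searchsorted(sorted_t, match)
    idx = np.clip(idx, 1, len(sorted_t) - 1)      # keep both neighbours in bounds
    left, right = sorted_t[idx - 1], sorted_t[idx]
    # pick whichever neighbour is closer to each match value
    return np.where(np.abs(match - left) <= np.abs(match - right), left, right)

match = np.array([1.1, 2.1, 5.1])
template = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(get_matched_time_sorted(match, template))  # [1. 2. 5.]
```

This runs in O((m + n) log n) time and O(m + n) memory instead of O(m * n) for both, at the cost of a one-off sort; whether it beats the broadcast version in practice depends on your array sizes, so it's worth timing both on your data.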