I have some code that is slow (30-60mins by last count), that I need to optimize, it is a data extraction script for Abaqus for a structural engineering model. The worst part of the script is the loop where it iterates through the object model database first by frame (i.e. the time in the time history of the simulation) and nested under this it iterates by each of the nodes. The silly thing is that there are ~100k 'nodes' but only about ~20k useful nodes. But luckily for me the nodes are always in the same order, meaning I do not need to look up the node's uniqueLabel, I can do this in a separate loop once and then filter what I get at the end. That is why I have dumped everything into one list and then I remove all the nodes that are repeats. But as you can see from the code:
timeValues = []
peeqValues = []
for frame in frames: #760 loops
setValues = frame.fieldOutputs['@@@fieldOutputType'].getSubset
region=abaqusSet, position=ELEMENT_NODAL).values
timeValues.append(frame.frameValue)
for value in setValues: # 100k loops
peeqValues.append(value.data)
It still needs to make the value.data
calls unnecessarily, about ~80k times. If anyone is familiar with Abaqus odb (object database) objects, they're super slow under python. To add insult to injury they only run in a single thread, under Abaqus which has its own python version (2.6.x) and packages (so e.g. numpy is available, pandas is not). Another thing that may be annoying is the fact that you can address the objects by position e.g. frames[-1]
gives you the last frame, but you cannot slice, so e.g. you can't do this: for frame in frames[0:10]: # iterate first 10 elements
.
I don't have any experience with itertools but I'd want to provide it a list of nodeIDs (or list of True/False) to map onto the setValues. The length and pattern of setValues to skip is always the same for each of the 760 frames. Maybe something like:
for frame in frames: #still 760 calls
setValues = frame.fieldOutputs['@@@fieldOutputType'].getSubset(
region=abaqusSet, position=ELEMENT_NODAL).values
timeValues.append(frame.frameValue)
# nodeSet_IDs_TF = [True, True, False, False, False, ...] same length as
# setValues
filteredSetValues = ifilter(nodeSet_IDs_TF, setValues)
for value in filteredSetValues: # only 20k calls
peeqValues.append(value.data)
Any other tips also appreciated, after this I did want to "avoid the dots" by removing the .append()
from the loop, and then putting the whole thing in a function to see if it helps. The whole script already runs in under 1.5 hours (down from 6 and at one point 21 hours), but once you start optimizing there is no way to stop.
Memory considerations also appreciated, I run these on a cluster and I believe I got away once with 80 GB of RAM. The scripts definitely work on 160 GB, the issue is getting the resources allocated to me.
I've searched around for a solution but maybe I'm using the wrong keywords, I'm sure this is not an uncommon issue in looping.
EDIT 1
Here is what I ended up using:
# there is no compress under 2.6.x ... so use the equivalent recipe:
from itertools import izip
def compress(data, selectors):
# compress('ABCDEF', [1,0,1,0,1,1]) --> ACEF
return (d for d, s in izip(data, selectors) if s)
def iterateOdb(frames, selectors): # minor speed up
peeqValues = []
timeValues = []
append = peeqValues.append # minor speed up
for frame in frames:
setValues = frame.fieldOutputs['@@@fieldOutputType'].getSubset(
region=abaqusSet, position=ELEMENT_NODAL).values
timeValues.append(frame.frameValue)
for value in compress(setValues, selectors): # massive speed up
append(value.data)
return peeqValues, timeValues
peeqValues, timeValues = iterateOdb(frames, selectors)
The biggest improvement came from using the compress(values, selectors)
method (the whole script, including the odb portion went from ~1:30 hours to 25mins. There was also a minor improvement from append = peeqValues.append
as well as enclosing everything in def iterateOdb(frames, selectors):
.
I used tips from: https://wiki.python.org/moin/PythonSpeed/PerformanceTips
Thanks to everyone for answering & helping!
If you're not confident with itertools try using an if statement in your for loop first.
eg.
for index, item in enumerate(values):
if not selectors[index]:
continue
...
# where selectors is a truth array like nodeSet_IDs_TF
This way you can be more sure that you are getting the correct behaviour and will get getting most of the performance increase you would get from using itertools.
The itertools equivalent is compress
.
for item in compress(values, selectors):
...
I'm not familiar with abaqus, but the best optimisations you could achieve would be to see if there is anyway way to give abaqus your selectors so it doesn't have to waste creating each value, only for it to be thrown away. If abaqus is used for doing large array-based manipulations of data then it's like this is the case.