python, optimization, openmdao

OpenMDAO Metamodel is not respecting training data


I am using an OpenMDAO semi-structured metamodel as part of a Dymos optimisation. There are two input values, so the input space is 2D. Usually, this works fine. However, I recently noticed that, for a certain set of training data, it does not interpolate properly.

I used the metamodel HTML visualisation tool to look at what was going on and could see the fit was wrong. Hovering over certain data points, I could see that they displayed the input data values properly. However, moving the cursor the slightest bit away from such a point in any direction gave a wildly different interpolated result. This means the metamodel "fit" does not go through the training data points, or even close to them in some regions.

This issue was present when using the method 'slinear'. I switched to 'lagrange2' and the fit seems much better now. This method does seem to be very computationally expensive, though: my optimisation has yet to complete and has already taken more than three times as long as it did with 'slinear'. Therefore, I would like to be able to go back to 'slinear'.

Does anyone have any insight into why this is happening and how to resolve the issue? All help is greatly appreciated. Thanks.


Solution

  • I'd need to see your data to have any guess as to why the linear interpolation isn't working, so I can't comment much on that. Most likely your semi-structured data has a hole in it near those points, and the linear interpolation is very poor there.

    However, I can offer some advice on how to improve the speed. Semi-structured data is sometimes convenient to use, but it comes at a very large computational cost. If all you have is semi-structured data but you want better speed, you can consider re-interpolating the data onto a structured grid first, then passing that into a structured metamodel instead. There is a stand-alone interface for semi-structured metamodels that you can use to write a small script that loops over a structured input grid and re-interpolates the data for you.

    With the structured metamodel, you should use the '2D-slinear' or '2D-lagrange2' methods. These are the fast, fixed-dimension options that will give the best performance.

    In general, I don't recommend using slinear interpolation for optimization, because it is not smooth. It does have the nice benefit of fitting the input data exactly, but the cost of the discontinuous first derivative (the fit is C0 but not C1) is pretty large on some problems (not all, but many). So some kind of smoothing is usually best.
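
The re-interpolation step described above might look roughly like the sketch below. It is not OpenMDAO's stand-alone interface: it swaps in a hand-written, plain-Python piecewise-linear resample so the idea can be shown without any dependencies. The names `lerp_1d`, `regrid`, `rows`, and `y_grid` are made up for illustration, and the training data is a hypothetical sample of f(x, y) = x + 2*y. It assumes the table is gridded in x while each x row carries its own set of y points.

```python
from bisect import bisect_right


def lerp_1d(xs, fs, x):
    """Piecewise-linear interpolation of the samples (xs, fs) at x.

    xs must be strictly ascending. Outside the sampled range the nearest
    end segment is extended linearly (extrapolation).
    """
    i = bisect_right(xs, x) - 1
    i = max(0, min(i, len(xs) - 2))  # clamp to a valid segment index
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return fs[i] + t * (fs[i + 1] - fs[i])


def regrid(rows, y_grid):
    """Resample every row of a semi-structured table onto one shared y grid.

    rows is a list of (x, ys, fs) tuples sorted by x, where each row has
    its own ascending y points. Returns (x_grid, table) with
    table[i][j] approximating f(x_grid[i], y_grid[j]).
    """
    x_grid = [x for x, _, _ in rows]
    table = [[lerp_1d(ys, fs, y) for y in y_grid] for _, ys, fs in rows]
    return x_grid, table


# Hypothetical semi-structured data sampled from f(x, y) = x + 2*y.
# Note that the x = 1.0 row does not reach y = 0.0, so that entry of the
# regridded table is extrapolated rather than interpolated.
rows = [
    (0.0, [0.0, 1.0, 3.0], [0.0, 2.0, 6.0]),
    (1.0, [0.5, 2.0, 3.0], [2.0, 5.0, 7.0]),
    (2.0, [0.0, 1.5, 2.5, 3.0], [2.0, 5.0, 7.0, 8.0]),
]
y_grid = [0.0, 1.0, 2.0, 3.0]

x_grid, table = regrid(rows, y_grid)
```

The resulting `x_grid`, `y_grid`, and `table` form a rectangular table, which is the layout a structured metamodel such as OpenMDAO's `MetaModelStructuredComp` expects as training data. Rows that do not span the full y range are filled by extrapolation, so it is worth checking those entries before trusting them, since gaps of that kind are also a plausible cause of a poor linear fit.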