
Optimization of expensive model with many parameters

Computational Science, asked on August 26, 2021

I have a physical model which takes $\sim 50$ parameters and gives $\sim 2000$ outputs, taking tens of minutes to run. I need to optimize these parameters so that the outputs are as close as possible to the data. The problem, of course, is that the model is expensive to evaluate and, probably worse, that there are so many parameters.

The best suggestion I have found so far is to use some kind of surrogate model and optimize that instead. However, these surrogate models are always (as far as I can see) built for functions with a single output, i.e. the cost function. This is still an option here, since I need some way to quantify how good the model is anyway, so I am trying to minimise the $\chi^2$. Then I can, for example, use Bayesian optimization or a quadratic surrogate to optimize it.
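For concreteness, here is a minimal sketch of such a scalar objective, assuming the expensive model is wrapped as a Python function `run_model(theta)` returning the $\sim 2000$ outputs (the names `run_model`, `data` and `sigma` are placeholders, not part of the actual model):

```python
import numpy as np

def chi_squared(theta, run_model, data, sigma):
    """Collapse the ~2000 model outputs into a single scalar misfit.

    theta     : array of ~50 (or ~15) parameters
    run_model : wrapper around the expensive physical model (placeholder name)
    data      : observed values, same shape as the model output
    sigma     : per-output uncertainties used to weight the residuals
    """
    outputs = np.asarray(run_model(theta))   # tens of minutes per call
    residuals = (outputs - data) / sigma
    return float(np.sum(residuals ** 2))
```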

My issue with this is that $\chi^2$ is essentially the ‘distance’ between the model result and the data in the high-dimensional output space. This feels like throwing away a huge amount of information: any optimization method based only on the cost function uses information about this distance, not about the actual behaviour of the model. Because it is a physical model, certain parameters affect the outputs in particular ways, and one can fit the data to some extent by hand. This is done without any explicit reference to the $\chi^2$, but being done by a human it will never be perfect. It also feels similar to an ‘inverse problem’, where one tries to find the most likely parameters for given data.

My questions are then: does there exist a way to build some kind of surrogate for the full model, rather than just for the $\chi^2$, so as to replicate the insight one uses when fitting by hand instead of just looking at the ‘distance’? Even putting the optimization problem aside, this would be extremely helpful for seeing how different parameters affect the output and for understanding the physics better, but I fear something like machine learning would require far too many evaluations. Then, regarding the optimization problem itself: even if such a surrogate can be built, would it be worth it compared to simply optimizing the $\chi^2$ directly? Lastly, would the inverse-problem idea help at all, i.e. could there be some way of taking the many outputs and ‘projecting’ them onto the most likely parameters, or is that just another way of stating the same problem?

Extra information: the calculation is not particularly noisy. There are no constraints on the parameters, but fitting by hand has already given a good idea of where I should be looking. I have also identified what I think are the $\sim 15$ most important parameters in case it turns out to be too difficult to optimise so many.

2 Answers

50 is a lot of parameters. You could try a basic first-order sensitivity analysis to determine whether you can drop any of them.
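One simple form of this is a one-at-a-time finite-difference screening around the hand-fitted point; the sketch below assumes the placeholder names `run_model` (the model wrapper) and `theta0` (the hand-fitted parameters), and costs one model run per parameter plus one baseline run:

```python
import numpy as np

def first_order_sensitivities(run_model, theta0, rel_step=0.05):
    """One-at-a-time finite-difference sensitivities around a hand-fitted point.

    Returns an (n_params, n_outputs) array of approximate d(output)/d(parameter),
    at a cost of n_params + 1 model runs. Consider normalising outputs by their
    typical scale first so the comparison across outputs is fair.
    """
    y0 = np.asarray(run_model(theta0))
    sens = np.empty((len(theta0), y0.size))
    for i in range(len(theta0)):
        theta = np.array(theta0, dtype=float)
        step = rel_step * (abs(theta[i]) if theta[i] != 0 else 1.0)
        theta[i] += step
        sens[i] = (np.asarray(run_model(theta)) - y0) / step
    return sens

# Rank parameters by overall influence, e.g. RMS sensitivity across outputs:
# ranking = np.argsort(-np.sqrt((sens**2).mean(axis=1)))
```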

Using Bayesian optimization to minimize a cost function is one way of dealing with the problem you've encountered. But remember that the standard $L^2$ distance, and hence a $\chi^2$-type misfit, can behave counterintuitively in high dimensions (see Aggarwal et al., "On the Surprising Behavior of Distance Metrics in High Dimensional Space").
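As a rough sketch of that route, using scikit-optimize (one of several Bayesian optimization libraries) and assuming the scalar `chi_squared` wrapper sketched in the question plus a search box `bounds` around the hand-fitted values:

```python
from skopt import gp_minimize   # scikit-optimize, one of several BO libraries

# Assumes chi_squared, run_model, data, sigma as sketched in the question, and
# bounds = list of (low, high) pairs around the hand-fitted values; BO needs a
# bounded box to search in even though the physical problem is unconstrained.
result = gp_minimize(
    lambda theta: chi_squared(theta, run_model, data, sigma),
    dimensions=bounds,
    n_calls=200,                 # total expensive evaluations you can afford
    n_initial_points=40,         # space-filling runs before the GP takes over
    noise=1e-10,                 # the simulation is essentially deterministic
)
print(result.x, result.fun)
```

With runs taking tens of minutes, `n_calls` is really set by your wall-clock budget rather than by the optimizer.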

An alternative is to use Bayesian History Matching. Some good sources for this are Gardner 2019, "Sequential Bayesian History Matching for Model Calibration" and Pievatolo 2018, "Bayes linear uncertainty analysis for oil reservoirs based on multiscale computer experiments".

The idea behind BHM is that you sample the parameter space and then train emulators (typically Gaussian processes) for each of your outputs, so that an emulator handed a new parameter set can cheaply predict the corresponding output.
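A minimal sketch of that step with scikit-learn, assuming `X` holds the sampled parameter sets (shape `n_runs x n_params`) and `Y` the corresponding model outputs (shape `n_runs x n_outputs`):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

def train_emulators(X, Y):
    """Train one Gaussian-process emulator per model output."""
    emulators = []
    for j in range(Y.shape[1]):
        kernel = (Matern(length_scale=np.ones(X.shape[1]), nu=2.5)
                  + WhiteKernel(noise_level=1e-6))
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
        gp.fit(X, Y[:, j])
        emulators.append(gp)
    return emulators
```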

You can now use the observed data, the GPR predictions, and the GPR standard deviations as the ingredients of an implausibility measure: a metric for how likely it is that a given set of parameters could produce output consistent with the observations.
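A sketch of the standard implausibility screen, assuming the emulators from the previous snippet plus placeholder arrays `z_obs` (observed values) and `var_obs` (observational variances); a model-discrepancy variance term is often added to the denominator as well, and the cutoff of 3 is the conventional choice in the BHM literature:

```python
import numpy as np

def max_implausibility(emulators, X_candidate, z_obs, var_obs):
    """I_j(x) = |z_j - mean_j(x)| / sqrt(emulator variance + observation variance),
    maximised over outputs j. Parameter sets with I(x) > ~3 are ruled implausible."""
    I = np.empty((X_candidate.shape[0], len(emulators)))
    for j, gp in enumerate(emulators):
        mean, std = gp.predict(X_candidate, return_std=True)
        I[:, j] = np.abs(z_obs[j] - mean) / np.sqrt(std**2 + var_obs[j])
    return I.max(axis=1)

# keep = X_candidate[max_implausibility(emulators, X_candidate, z_obs, var_obs) < 3.0]
```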

Doing this iteratively shrinks the size of your parameter space, sometimes dramatically. For instance, Andrianakis 2015, "Bayesian History Matching of Complex Infectious Disease Models Using Emulation: A Tutorial and a Case Study on HIV in Uganda", uses these techniques to decrease the non-implausible parameter space of a complex HIV model by a factor of $10^{11}$.

Unfortunately, this tends to work best if you have sufficiently few observations that you can validate your emulators.

(I've done some work on a history matching package here, but I'm afraid it isn't yet at a point where it can be useful to you.)

Answered by Richard on August 26, 2021

Yes, tens of minutes for the model to run is a lot. If you are using a gradient-based minimization algorithm such as BFGS to fit the parameters, you might consider using the adjoint method for computing the gradient very efficiently.
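To illustrate why the adjoint helps, here is a toy sketch (not the asker's model) in which the outputs solve a linear system: a single extra adjoint solve gives the gradient of $\chi^2$ with respect to all parameters at once, instead of one extra model run per parameter as finite differences would need.

```python
import numpy as np
from scipy.optimize import minimize

# Toy setup, purely illustrative: the model output u(theta) solves
# A(theta) u = b with A(theta) = A0 + sum_i theta_i * B_i, and the misfit is
# chi^2 = ||u - d||^2 / sigma^2.
rng = np.random.default_rng(0)
n_out, n_par = 200, 5
B = 0.01 * rng.normal(size=(n_par, n_out, n_out))   # dA/dtheta_i (constant here)
A0 = np.eye(n_out)
b = rng.normal(size=n_out)
d = rng.normal(size=n_out)                          # "data"
sigma = 0.1

def chi2_and_grad(theta):
    A = A0 + np.tensordot(theta, B, axes=1)
    u = np.linalg.solve(A, b)                            # forward solve
    chi2 = np.sum((u - d) ** 2) / sigma**2
    lam = np.linalg.solve(A.T, 2 * (u - d) / sigma**2)   # single adjoint solve
    grad = np.array([-lam @ (B[i] @ u) for i in range(n_par)])  # d(chi^2)/d(theta_i)
    return chi2, grad

res = minimize(chi2_and_grad, np.zeros(n_par), jac=True, method="BFGS")
print(res.x, res.fun)
```

For a real simulator the adjoint solve has to be implemented inside the model code (or obtained via automatic differentiation), which is the main practical hurdle.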

Answered by Nachiket on August 26, 2021
