Fig. 1: Overview of model calibration framework by Bayesian optimization, acquisition function, and Gaussian process and machine learning emulators. | Nature Communications

From: Emulator-based Bayesian optimization for efficient multi-objective calibration of an individual-based model of malaria

a General framework. The input parameter space is first sampled in a space-filling manner, generating the initial core parameter sets (initialization). For each candidate set, simulations are performed with the model, mirroring the studies that yielded the calibration data. The deviation between simulation results and data is assessed, yielding a goodness-of-fit score for each parameter set. An emulator (c or d) is trained to capture the relationship between parameter sets and goodness of fit, and is used to generate out-of-sample predictions. Based on these, the most promising additional parameter sets are chosen (adaptive sampling by means of an acquisition function), evaluated, and added to the training set of simulations. Training and adaptive sampling are repeated until the emulator converges and a decision on the parameter set yielding the best fit is made.

b Acquisition function. The acquisition function (black line) is used to determine new parameter-space locations \(\boldsymbol{\theta}\), where \(\boldsymbol{\theta}\) is a vector of input parameters (23-dimensional for the model described here), to be evaluated during adaptive sampling (blue dots: previously evaluated locations; orange dots: new locations to be evaluated in the current iteration). It incorporates both the predictive uncertainty of the emulator (blue shading) and proximity to the minimum.

c Gaussian process (GP) emulator. A heteroscedastic Gaussian process is used to generate predictions of the loss functions, \(\hat{f}_{\mathrm{GP}}(\boldsymbol{\theta})\), for each input parameter set \(\boldsymbol{\theta}\).

d Gaussian process stacked generalization (GPSG) emulator. Three machine learning algorithms (level-0 learners: bilayer neural network, multivariate adaptive regression splines, and random forest) are used to generate predictions of the individual objective loss functions \(\hat{f}_{\mathrm{NN}}\), \(\hat{f}_{\mathrm{MARS}}\), and \(\hat{f}_{\mathrm{RF}}\) (collectively \(\hat{f}_{\mathrm{ML}}\)) at locations \(\boldsymbol{\theta}\). These predictions are inputs to a heteroscedastic Gaussian process (level-1 learner), which is used to generate the stacked-learner predictions \(\hat{f}_{\mathrm{GPSG}}\) and to derive predictions of the overall goodness of fit \(\hat{F}_{\mathrm{GPSG}}\).
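The initialize–emulate–acquire–evaluate loop of panels a–c can be sketched in a few lines. This is a minimal illustration using scikit-learn, not the paper's implementation: the quadratic toy loss stands in for the simulation-based goodness-of-fit score, a GP with a white-noise kernel stands in for the heteroscedastic GP, and a lower-confidence-bound rule over random candidates stands in for the actual acquisition function; the 2-dimensional space, bounds, and iteration counts are all illustrative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

rng = np.random.default_rng(0)

def loss(theta):
    # Toy stand-in for the simulation-derived goodness-of-fit score
    return np.sum((theta - 0.3) ** 2, axis=-1)

dim = 2                        # the actual model has 23 input parameters
n_init, n_iter, n_cand = 10, 15, 500

# Initialization: space-filling sample of the unit hypercube
# (a Latin hypercube would typically be used; uniform keeps this short)
X = rng.uniform(0.0, 1.0, size=(n_init, dim))
y = loss(X)

for _ in range(n_iter):
    # Emulator: GP regression on (parameter set, loss) pairs;
    # the WhiteKernel models observation noise
    gp = GaussianProcessRegressor(
        kernel=Matern(nu=2.5) + WhiteKernel(), normalize_y=True
    ).fit(X, y)

    # Acquisition: lower confidence bound trades off predicted loss
    # (proximity to the minimum) against predictive uncertainty
    cand = rng.uniform(0.0, 1.0, size=(n_cand, dim))
    mu, sigma = gp.predict(cand, return_std=True)
    theta_new = cand[np.argmin(mu - 2.0 * sigma)]

    # Adaptive sampling: evaluate the chosen point, grow the training set
    X = np.vstack([X, theta_new])
    y = np.append(y, loss(theta_new))

best = X[np.argmin(y)]  # parameter set yielding the best fit so far
```

In practice the loop would terminate on an emulator-convergence criterion rather than a fixed iteration budget.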
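The stacked generalization of panel d can likewise be sketched: level-0 learners predict the loss, and a GP level-1 learner is trained on those predictions. This is a hedged illustration, not the paper's setup: scikit-learn ships no MARS implementation, so a gradient-boosting regressor stands in for it; a full implementation would use out-of-fold level-0 predictions and separate per-objective losses rather than the single toy loss and in-sample predictions used here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

def loss(theta):
    # Toy stand-in for an objective loss function
    return np.sum(np.sin(3.0 * theta) + theta ** 2, axis=-1)

X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = loss(X)

# Level-0 learners: bilayer neural net, a boosting stand-in for MARS,
# and a random forest
level0 = [
    MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0),
    GradientBoostingRegressor(random_state=0),
    RandomForestRegressor(random_state=0),
]
# Stack the level-0 predictions column-wise (f_NN, f_"MARS", f_RF)
Z = np.column_stack([m.fit(X, y).predict(X) for m in level0])

# Level-1 learner: GP trained on the level-0 predictions
gp = GaussianProcessRegressor(
    kernel=Matern(nu=2.5) + WhiteKernel(), normalize_y=True
).fit(Z, y)

# Stacked predictions at new locations, with predictive uncertainty
X_new = rng.uniform(-1.0, 1.0, size=(5, 2))
Z_new = np.column_stack([m.predict(X_new) for m in level0])
f_hat, f_std = gp.predict(Z_new, return_std=True)
```

The level-1 GP sees only the level-0 outputs, so its predictive standard deviation reflects disagreement among, and residual error of, the base learners.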
