This tutorial will show how to use qNEHVI with fully bayesian inference for multi-objective optimization.

Multi-objective optimization (MOO) covers the case where we care about multiple
outcomes in our experiment but we do not know before hand a specific weighting of those
objectives (covered by `ScalarizedObjective`

) or a specific constraint on one objective
(covered by `OutcomeConstraint`

s) that will produce the best result.

The solution in this case is to find a whole Pareto frontier, a surface in outcome-space containing points that can't be improved on in every outcome. This shows us the tradeoffs between objectives that we can choose to make.

Optimize a list of M objective functions $ \bigl(f^{(1)}( x),..., f^{(M)}( x) \bigr)$ over a bounded search space $\mathcal X \subset \mathbb R^d$.

We assume $f^{(i)}$ are expensive-to-evaluate black-box functions with no known analytical expression, and no observed gradients. For instance, a machine learning model where we're interested in maximizing accuracy and minimizing inference time, with $\mathcal X$ the set of possible configuration spaces

Previous work, has shown that using a fully Bayesian treatment of GP model hyperparameters $\boldsymbol \theta$ can lead to improved closed loop Bayesian optimization performance [1]. Snoek et al [1] propose to use an integrated acquisition function $\alpha_{MCMC}$ where the base acquisition function $\alpha(\mathbf{x} | \boldsymbol \theta, \mathcal D)$ is integrated over the the posterior distribution over the hyperparameters $p({\boldsymbol{\theta}} | \mathcal{D})$, where $ \mathcal{D} = \{{\mathbf{x}}_i, y_i\}_{i=1}^n$:

$\alpha_{MCMC}(\mathbf{x}, \mathcal D) = \int \alpha(\mathbf{x} | \boldsymbol \theta, \mathcal D) p(\boldsymbol \theta | \mathcal D) d\boldsymbol \theta$

Since $p({\boldsymbol{\theta}} | \mathcal{D})$ typically cannot be expressed in closed-form, Markov Chain Monte-Carlo (MCMC) methods are used to draw samples from $p({\boldsymbol{\theta}} | \mathcal{D})$. In this tutorial we use the NUTS sampler from the pyro package for automatic, robust fully Bayesian inference.

[1] J. Snoek, H. Larochelle, R. P. Adams, Practical Bayesian Optimization of Machine Learning Algorithms. Advances in Neural Information Processing Systems 26, 2012.

Recently Eriksson et al [2] propose using sparse axis-aligned subspace priors for Bayesian optimization over high-dimensional search spaces. Specifically, the authors propose using a hierarchical sparsity prior consisting of a global shrinkage parameter with a Half-Cauchy prior $\tau \sim \mathcal{HC}(\beta)$, and ARD lengthscales $\rho_d \sim \mathcal{HC}(\tau)$ for $d=1, ..., D$. See [2] for details.

[2] D. Eriksson, M. Jankowiak. High-Dimensional Bayesian Optimization with Sparse Axis-Aligned Subspaces. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021.

In this tutorial, we use qNEHVI [3] as our acquisition function for multi-objective optimization. We integrate qNEHVI over the posterior distribution of the GP hyperparameters as proposed in [4].

[3] S. Daulton, M. Balandat, E. Bakshy. Parallel Bayesian Optimization of Multiple Noisy Objectives with Expected Hypervolume Improvement. Arxiv, 2021.

[4] D. Eriksson, P. Chuang, S. Daulton, P. Xia, A. Shrivastava, A. Babu, S. Zhao, A. Aly, G. Venkatesh, M. Balandat. Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization. ICML AutoML Workshop, 2021.

For a deeper explanation of multi-objective optimization, please refer to the dedicated multi-objective optimization tutorial: https://ax.dev/tutorials/multiobjective_optimization.html.

In this tutorial, we use Ax Developer API. Additional resources:

- To learn more about the developer API, refer to the dedicated tutorial: https://ax.dev/tutorials/gpei_hartmann_developer.html.
- To set up a
`GenerationStrategy`

with multi-objective SAASBO (and use it in Ax Service API), follow the generation strategy tutorial: https://ax.dev/tutorials/generation_strategy.html and use`Models.FULLYBAYESIANMOO`

for the Bayesian optimization generation step. - To learn about multi-objective optimization in Ax Service API: https://ax.dev/tutorials/multiobjective_optimization.html#Using-the-Service-API.

In [1]:

```
import pandas as pd
from ax import *
import torch
import numpy as np
from ax.metrics.noisy_function import GenericNoisyFunctionMetric
from ax.service.utils.report_utils import exp_to_df
from ax.runners.synthetic import SyntheticRunner
# Plotting imports and initialization
from ax.utils.notebook.plotting import render, init_notebook_plotting
from ax.plot.contour import plot_contour
from ax.plot.pareto_utils import compute_posterior_pareto_frontier
from ax.plot.pareto_frontier import plot_pareto_frontier
init_notebook_plotting()
# Model registry for creating multi-objective optimization models.
from ax.modelbridge.registry import Models
# Analysis utilities, including a method to evaluate hypervolumes
from ax.modelbridge.modelbridge_utils import observed_hypervolume
```