SPDX-License-Identifier: MIT

Fast Posterior Mean Tutorial

This notebook walks through the fast posterior mean workflow presented in Fast Gaussian Process Posterior Mean Prediction via Local Cross Validation and Precomputation (Dunton et. al 2022) and explains the relevant components of MuyGPyS.

The cell below uses the same code as that found in univariate_regression_tutorial.ipynb. This includes generating the synthetic data from a GP and training two MuyGPs models to fit the data using Bayesian optimization.

[2]:

import matplotlib.pyplot as plt
import numpy as np

from utils import UnivariateSampler

from MuyGPyS._test.gp import benchmark_sample, BenchmarkGP
from MuyGPyS.gp.distortion import IsotropicDistortion, NullDistortion, l2
from MuyGPyS.gp.hyperparameter import ScalarHyperparameter
from MuyGPyS.gp.kernels import Matern
from MuyGPyS.gp.noise import HomoscedasticNoise

np.random.seed(0)

lb = -10.0
ub = 10.0
data_count = 5001
train_step = 10
nugget_var = 1e-14
fixed_length_scale = 1.0
measurement_eps = 1e-3
sampler = UnivariateSampler(
    lb=lb,
    ub=ub,
    data_count=data_count,
    train_step=train_step,
    kernel=Matern(
        nu=ScalarHyperparameter(0.5),
        metric=IsotropicDistortion(
            l2,
            length_scale=ScalarHyperparameter(fixed_length_scale),
        ),
    ),
    eps=HomoscedasticNoise(nugget_var),
    measurement_eps=HomoscedasticNoise(measurement_eps),
)
train_features, test_features = sampler.features()
test_count, _ = test_features.shape
train_count, _ = train_features.shape

train_responses, test_responses = sampler.sample()

sampler.plot_sample()

from MuyGPyS.neighbors import NN_Wrapper
nn_count = 10
nbrs_lookup = NN_Wrapper(train_features, nn_count, nn_method="exact",algorithm="ball_tree")

from MuyGPyS.optimize.batch import sample_batch
batch_count = train_count
batch_indices, batch_nn_indices = sample_batch(
    nbrs_lookup, batch_count, train_count
)

from MuyGPyS.gp import MuyGPS
muygps = MuyGPS(
    kernel=Matern(
        nu=ScalarHyperparameter(0.5),
        metric=IsotropicDistortion(
            metric=l2,
            length_scale=ScalarHyperparameter(
                "log_sample", (0.1, 5.0))
        ),
    ),
    eps=HomoscedasticNoise(measurement_eps),
)

from MuyGPyS.gp.tensors import crosswise_tensor
batch_crosswise_diffs = crosswise_tensor(
    train_features,
    train_features,
    batch_indices,
    batch_nn_indices,
)

from MuyGPyS.gp.tensors import pairwise_tensor
pairwise_diffs = pairwise_tensor(
    train_features, batch_nn_indices
)

Kcross = muygps.kernel(batch_crosswise_diffs)
K = muygps.kernel(pairwise_diffs)

batch_targets = train_responses[batch_indices, :]
batch_nn_targets = train_responses[batch_nn_indices, :]

from MuyGPyS.gp.tensors import make_train_tensors
(
    batch_crosswise_diffs,
    batch_pairwise_diffs,
    batch_targets,
    batch_nn_targets,
) = make_train_tensors(
    batch_indices,
    batch_nn_indices,
    train_features,
    train_responses,
)

from MuyGPyS.optimize import optimize_from_tensors

muygps = optimize_from_tensors(
    muygps,
    batch_targets,
    batch_nn_targets,
    batch_crosswise_diffs,
    batch_pairwise_diffs,
    loss_method="lool",
    obj_method="loo_crossval",
    opt_method="bayesian",
    verbose=False,
    random_state=1,
    init_points=5,
    n_iter=2,
)

from MuyGPyS.optimize.sigma_sq import muygps_sigma_sq_optim

K = muygps.kernel(batch_pairwise_diffs)
muygps = muygps_sigma_sq_optim(muygps, batch_pairwise_diffs, batch_nn_targets, sigma_method="analytic")

Matplotlib is building the font cache; this may take a moment.

../_images/examples_fast_regression_tutorial_4_1.png

Fast Prediction

With set (or learned) hyperparameters, we are able to use the muygps object for fast prediction capability. Several workflows are supported.

See below a fast posterior mean workflow, using the data structures built up in this example. This workflow uses the compact tensor-making function make_fast_predict_tensors() to succinctly create tensors defining the pairwise_diffs among each nearest neighbor and the train_nn_targets_fast or responses of the nearest neighbors in each set. We then create theK covariance tensor and form the precomputed coefficients matrix. We then pass the precomputed coefficients matrix, the updated nn_indices matrix, and the closest neighbor of each test point to MuyGPS.fast_posterior_mean() in order to obtain our predictions.

[3]:

from MuyGPyS.gp.tensors import make_fast_predict_tensors, fast_nn_update, make_predict_tensors
nn_indices,_ = nbrs_lookup.get_nns(train_features)
nn_indices = nn_indices.astype(int)

[4]:

from MuyGPyS.gp.tensors import make_predict_tensors
nn_indices = fast_nn_update(nn_indices)

#find the closest training point to each test point, and its corresponding nearest neighbor set

test_neighbors, _ = nbrs_lookup.get_nns(test_features)
closest_neighbor = test_neighbors[:, 0]
closest_set = nn_indices[closest_neighbor, :].astype(int)

#make crosswise distances tensor for prediction
(
    crosswise_diffs,
    pairwise_diffs,
    nn_targets,
) = make_predict_tensors(
    np.arange(test_count),
    closest_set,
    test_features,
    train_features,
    train_responses,
)

K = muygps.kernel(pairwise_diffs)

precomputed_coefficients_matrix = muygps.fast_coefficients(
K,
nn_targets)



#perform GP regression

Kcross = muygps.kernel(crosswise_diffs)
fast_predictions = muygps.fast_posterior_mean(
    Kcross,
    precomputed_coefficients_matrix[closest_neighbor,:,:])

Regular Prediction

With set (or learned) hyperparameters, we are able to use the muygps object to predict the response of test data. Several workflows are supported.

See below a simple posterior mean workflow, using the data structures built up in this example. This workflow uses the compact tensor-making function make_predict_tensors() to succinctly create tensors defining the pairwise_diffs among each nearest neighbor set, the crosswise_diffs between each test point and its nearest neighbor set, and the nn_targets or responses of the nearest neighbors in each set. We then create the Kcross cross-covariance matrix and K covariance tensor and pass them to MuyGPS.posterior_mean() in order to obtain our predictions.

[5]:

from MuyGPyS.gp.tensors import make_predict_tensors

# make the indices
test_count, _ = test_features.shape
indices = np.arange(test_count)
nn_indices, _ = nbrs_lookup.get_nns(test_features)

# make difference and target tensors
(
    crosswise_diffs,
    pairwise_diffs,
    nn_targets,
) = make_predict_tensors(
    indices,
    nn_indices,
    test_features,
    train_features,
    train_responses,
)

# Make the kernel
Kcross = muygps.kernel(crosswise_diffs)
K = muygps.kernel(pairwise_diffs)

# perform Gaussian process regression

predictions = muygps.posterior_mean(
    K, Kcross, nn_targets
)

Timing Experiment

We compare the prediction time of a regular posterior mean workflow to that of the fast posterior mean workflow. In the regular posterior mean workflow we compute the sum of the time it takes to identify the nearest neighbors of the test features, the time it takes to form the relevant kernel tensors, and the time to solve the posterior means. In the fast posterior mean case, we compute the sum of the time it takes to identify the nearest neighbor of each test point, the coefficient lookup in the precomputed coefficient matrix, and the dot product to form posterior means.

[6]:

from MuyGPyS.optimize.loss import mse_fn
import timeit


test_count, _ = test_features.shape
indices = np.arange(test_count)



def timing_posterior_mean():
    nn_indices, _ = nbrs_lookup.get_nns(test_features)
    (
        crosswise_diffs,
        pairwise_diffs,
        nn_targets,
    ) = make_predict_tensors(
        indices,
        nn_indices,
        test_features,
        train_features,
        train_responses,
    )

    Kcross = muygps.kernel(crosswise_diffs)
    K = muygps.kernel(pairwise_diffs)
    predictions = muygps.posterior_mean(
        K, Kcross, nn_targets
    )

print(f"regular RMSE:")
print(f"\tRMSE: {np.sqrt(mse_fn(predictions, test_responses))}")
print("regular prediction time:")
%timeit timing_posterior_mean()



nn_indices = fast_nn_update(nn_indices)
def timing_fast_posterior_mean():
    test_neighbors, _ = nbrs_lookup.get_nns(test_features)
    closest_neighbor = test_neighbors[:, 0]
    closest_set = nn_indices[closest_neighbor, :].astype(int)

    #make crosswise distances tensor
    (
        crosswise_diffs,
        pairwise_diffs,
        nn_targets,
    ) = make_predict_tensors(
        np.arange(test_count),
        closest_set,
        test_features,
        train_features,
        train_responses,
    )



    Kcross = muygps.kernel(crosswise_diffs)
    fast_predictions = muygps.fast_posterior_mean(
        Kcross,
        precomputed_coefficients_matrix[closest_neighbor,:,:])



print(f"fast prediction RMSE:")
print(f"\tRMSE: {np.sqrt(mse_fn(fast_predictions, test_responses))}")
print("fast prediction time:")
%timeit timing_fast_posterior_mean()

regular RMSE:
        RMSE: 0.1206710068099257
regular prediction time:
64.7 ms ± 1.92 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
fast prediction RMSE:
        RMSE: 1.658329619936535
fast prediction time:
14.8 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Results

We achieve roughly two orders of magnitude speedup using the fast prediction acceleration. The improvement is even more dramatic when the methods are implemented in JAX.