MultivariateMuyGPS

class MuyGPyS.gp.multivariate_muygps.MultivariateMuyGPS(*model_args)[source]

Multivariate Local Kriging Gaussian Process.

Performs approximate GP inference by locally approximating an observation’s response using its nearest neighbors, with a separate kernel allocated for each response dimension. Each kernel is implemented as an individual MuyGPyS.gp.muygps.MuyGPS object.

This class is similar in interface to MuyGPyS.gp.muygps.MuyGPS, but requires a list of hyperparameter dicts at initialization.

Example

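>>> # Import paths below are assumed from the MuyGPyS package layout.
>>> from MuyGPyS.gp import MultivariateMuyGPS as MMuyGPS
>>> from MuyGPyS.gp.deformation import Isotropy, l2
>>> from MuyGPyS.gp.hyperparameter import AnalyticScale, Parameter
>>> from MuyGPyS.gp.kernels import Matern
>>> k_kwargs1 = {
...     "noise": Parameter(1e-5),
...     "kernel": Matern(
...         nu=Parameter(0.67, (0.1, 2.5)),
...         deformation=Isotropy(
...             metric=l2,
...             length_scale=Parameter(0.2),
...         ),
...         scale=AnalyticScale(),
...     ),
... }
>>> k_kwargs2 = {
...     "noise": Parameter(1e-5),
...     "kernel": Matern(
...         nu=Parameter(0.67, (0.1, 2.5)),
...         deformation=Isotropy(
...             metric=l2,
...             length_scale=Parameter(0.2),
...         ),
...         scale=AnalyticScale(),
...     ),
... }
>>> # One hyperparameter dict per response dimension.
>>> k_args = [k_kwargs1, k_kwargs2]
>>> mmuygps = MMuyGPS(*k_args)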

We can realize kernel tensors for each of the models contained within a MultivariateMuyGPS object by iterating over its models member. Once we have computed pairwise_diffs and crosswise_diffs tensors, it is straightforward to perform each of these realizations.

Example

>>> for model in mmuygps.models:
...     K = model.kernel(pairwise_diffs)
...     Kcross = model.kernel(crosswise_diffs)
...     # do something with K and Kcross...
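The shapes of the two difference tensors used throughout this page can be illustrated with plain NumPy. This is a minimal sketch, not the library's own tensor construction: the helper names `pairwise_differences` and `crosswise_differences` and the brute-force neighbor search are assumptions made for illustration only.

```python
import numpy as np


def pairwise_differences(features, nn_indices):
    """Differences among each batch element's neighbors.

    Returns shape (batch_count, nn_count, nn_count, feature_count).
    """
    points = features[nn_indices]  # (batch_count, nn_count, feature_count)
    return points[:, :, None, :] - points[:, None, :, :]


def crosswise_differences(batch_features, features, nn_indices):
    """Differences between each batch element and its neighbors.

    Returns shape (batch_count, nn_count, feature_count).
    """
    points = features[nn_indices]
    return batch_features[:, None, :] - points


rng = np.random.default_rng(0)
train = rng.normal(size=(20, 3))  # 20 training points, 3 features
batch = rng.normal(size=(5, 3))   # 5 batch elements

# Brute-force nearest neighbors (hypothetical stand-in for a real index).
dists = np.linalg.norm(batch[:, None, :] - train[None, :, :], axis=-1)
nn_indices = np.argsort(dists, axis=1)[:, :4]  # nn_count = 4

pairwise_diffs = pairwise_differences(train, nn_indices)
crosswise_diffs = crosswise_differences(batch, train, nn_indices)
print(pairwise_diffs.shape, crosswise_diffs.shape)  # (5, 4, 4, 3) (5, 4, 3)
```

Each model's kernel then reduces the trailing feature axis, so a realization of `pairwise_diffs` has shape (batch_count, nn_count, nn_count) and a realization of `crosswise_diffs` has shape (batch_count, nn_count).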

Parameters:

model_args – Dictionaries defining each internal MuyGPyS.gp.muygps.MuyGPS instance.

fast_coefficients(pairwise_diffs_fast, train_nn_targets_fast)[source]

Produces coefficient tensor for fast posterior mean inference given in Equation (8) of [dunton2022fast].

To form the tensor, we compute

\[\mathbf{C}_{N^*}(i, :, j) = (K_{\hat{\theta_j}} (X_{N^*}, X_{N^*}) + \varepsilon_j)^{-1} Y(X_{N^*}).\]

Here $X_{N^*}$ is the union of the nearest neighbor of the ith test point and the nn_count - 1 nearest neighbors of this nearest neighbor, $K_{\hat{\theta_j}}$ is the trained kernel functor corresponding to the jth response and specified by self.models, $\varepsilon_j$ is a diagonal noise matrix whose diagonal elements are informed by the self.models[j].noise hyperparameter, and $Y(X_{N^*})$ is the (train_count, response_count) matrix of responses corresponding to the training features indexed by $N^*$.

Parameters:

pairwise_diffs_fast (ndarray) – A tensor of shape (train_count, nn_count, nn_count, feature_count) containing the (nn_count, nn_count, feature_count)-shaped pairwise nearest neighbor difference tensors corresponding to each of the training elements.
train_nn_targets_fast (ndarray) – A tensor of shape (train_count, nn_count, response_count) listing the vector-valued responses for the nearest neighbors of each training element.

Return type:

ndarray

Returns:

A tensor of shape (train_count, nn_count, response_count) whose entries comprise the precomputed coefficients for fast posterior mean inference.
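The coefficient computation above amounts to one batched linear solve per response dimension. The following NumPy sketch shows that structure under stated assumptions: `rbf_kernel` is a hypothetical stand-in for the trained kernel functors, and the per-response `length_scales` and `noises` lists are illustrative parameters, not part of the MuyGPyS interface.

```python
import numpy as np


def rbf_kernel(diffs, length_scale):
    """Stand-in isotropic RBF kernel applied over the trailing feature axis."""
    sq_dists = np.sum(diffs**2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def fast_coefficients_sketch(pairwise_diffs, nn_targets, length_scales, noises):
    """Solve (K_j + eps_j I) C[:, :, j] = Y[:, :, j] for each response j."""
    train_count, nn_count, response_count = nn_targets.shape
    eye = np.eye(nn_count)
    coeffs = np.empty((train_count, nn_count, response_count))
    for j in range(response_count):
        # Kernel realization for response j plus its diagonal noise term.
        K = rbf_kernel(pairwise_diffs, length_scales[j]) + noises[j] * eye
        rhs = nn_targets[:, :, j, None]  # (train_count, nn_count, 1)
        coeffs[:, :, j] = np.linalg.solve(K, rhs)[..., 0]
    return coeffs


rng = np.random.default_rng(0)
points = rng.normal(size=(6, 4, 2))  # 6 neighborhoods, 4 neighbors, 2 features
pairwise_diffs = points[:, :, None, :] - points[:, None, :, :]
nn_targets = rng.normal(size=(6, 4, 3))  # 3 response dimensions
coeffs = fast_coefficients_sketch(
    pairwise_diffs, nn_targets, length_scales=[0.5, 1.0, 2.0], noises=[1e-3] * 3
)
print(coeffs.shape)  # (6, 4, 3)
```

Because the solves depend only on training data, they can be done once and reused for every subsequent query.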

fast_posterior_mean(crosswise_diffs, coeffs_tensor)[source]

Performs fast posterior mean inference using provided crosswise differences and precomputed coefficient matrix.

Returns the predicted response in the form of a posterior mean for each element of the batch of observations, as computed in Equation (9) of [dunton2022fast]. For each test point $\mathbf{z}$, we compute

\[\widehat{Y} (\mathbf{z} \mid X) = K_\theta (\mathbf{z}, X_{N^*}) \mathbf{C}_{N^*}.\]

Here $X_{N^*}$ is the union of the nearest neighbor of the queried test point $\mathbf{z}$ and the nearest neighbors of that training point, $K_\theta$ is the kernel functor of the corresponding internal model in self.models, and $\mathbf{C}_{N^*}$ is the matrix of precomputed coefficients given in Equation (8) of [dunton2022fast].

Parameters:

crosswise_diffs (ndarray) – A matrix of shape (batch_count, nn_count, feature_count) whose rows list the difference between each feature of each batch element and its nearest neighbors.
coeffs_tensor (ndarray) – A tensor of shape (batch_count, nn_count, response_count) providing the precomputed coefficients.

Return type:

ndarray

Returns:

A matrix of shape (batch_count, response_count) whose rows are the predicted response for each of the given indices.
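With the coefficients in hand, the fast posterior mean reduces to a row-wise dot product between each cross-kernel vector and the matching coefficient vector. The sketch below illustrates this in NumPy; `rbf_kernel` and the `length_scales` list are hypothetical stand-ins for the per-response kernel functors.

```python
import numpy as np


def rbf_kernel(diffs, length_scale):
    """Stand-in isotropic RBF kernel applied over the trailing feature axis."""
    sq_dists = np.sum(diffs**2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def fast_posterior_mean_sketch(crosswise_diffs, coeffs_tensor, length_scales):
    """Compute K_j(z, X_N*) @ C[:, :, j] for every batch element and response."""
    batch_count, nn_count, response_count = coeffs_tensor.shape
    mean = np.empty((batch_count, response_count))
    for j in range(response_count):
        Kcross = rbf_kernel(crosswise_diffs, length_scales[j])  # (batch, nn)
        # Row-wise dot product: no linear solve is needed at query time.
        mean[:, j] = np.einsum("bn,bn->b", Kcross, coeffs_tensor[:, :, j])
    return mean


rng = np.random.default_rng(1)
crosswise_diffs = rng.normal(size=(5, 4, 2))  # 5 queries, 4 neighbors, 2 features
coeffs_tensor = rng.normal(size=(5, 4, 2))    # 2 response dimensions
mean = fast_posterior_mean_sketch(crosswise_diffs, coeffs_tensor, [0.5, 1.0])
print(mean.shape)  # (5, 2)
```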

fixed()[source]

Checks whether all kernel and model parameters are fixed for each model, excluding $\sigma^2$.

Return type:

bool

Returns:

True if all parameters in all models are fixed, and False otherwise.

optimize_scale(pairwise_diffs, nn_targets)[source]

Optimize the value of the $\sigma^2$ scale parameter for each response dimension.

We approximate $\sigma^2$ by way of averaging over the analytic solution from each local kernel.

\[\sigma^2 = \frac{1}{bk} \sum_{i \in B} Y_{nn_i}^T K_{nn_i}^{-1} Y_{nn_i}\]

Here $Y_{nn_i}$ and $K_{nn_i}$ are the target and kernel matrices with respect to the nearest neighbor set in scope, where $k$ is the number of nearest neighbors and $b = |B|$ is the number of batch elements considered.

Parameters:

pairwise_diffs (ndarray) – A tensor of shape (batch_count, nn_count, nn_count, feature_count) containing the (nn_count, nn_count, feature_count)-shaped pairwise nearest neighbor difference tensors corresponding to each of the batch elements.
nn_targets (ndarray) – Tensor of floats of shape (batch_count, nn_count, response_count) containing the expected response for each nearest neighbor of each batch element.

Returns:

The MultivariateMuyGPS model whose scale parameter (and those of its submodels) has been optimized.
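The analytic estimate can be written in a few lines of NumPy for a single response dimension; a multivariate model would repeat it once per response with that response's kernel. This is a sketch of the formula only: `analytic_scale_sketch` is a hypothetical name and the kernel tensor is supplied directly rather than produced by a MuyGPyS model.

```python
import numpy as np


def analytic_scale_sketch(K, nn_targets_j):
    """Average Y^T K^{-1} Y over the batch, normalized by b * k.

    K has shape (b, k, k); nn_targets_j has shape (b, k).
    """
    b, k = nn_targets_j.shape
    # Batched solve of K x = Y, one system per batch element.
    solved = np.linalg.solve(K, nn_targets_j[:, :, None])[..., 0]
    return np.sum(nn_targets_j * solved) / (b * k)


b, k = 3, 4
K = np.broadcast_to(2.0 * np.eye(k), (b, k, k))  # every local kernel is 2 * I
Y = np.ones((b, k))
sigma_sq = analytic_scale_sketch(K, Y)
print(sigma_sq)  # 0.5
```

In this toy case each quadratic form equals k / 2, so the average over b * k entries is 0.5.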

posterior_mean(pairwise_diffs, crosswise_diffs, batch_nn_targets)[source]

Performs simultaneous posterior mean inference on provided difference tensors and the target matrix.

Computes parallelized local solves of systems of linear equations using the kernel realizations, one for each internal model, of the last two dimensions of pairwise_diffs along with crosswise_diffs and batch_nn_targets to predict responses in terms of the posterior mean. Assumes that difference tensors pairwise_diffs and crosswise_diffs are already computed and given as arguments.

Returns the predicted response in the form of a posterior mean for each element of the batch of observations by solving a system of linear equations induced by each kernel functor, one per response dimension, in a generalization of Equation (3.4) of [muyskens2021muygps]. For each batch element $\mathbf{x}_i$ we compute

\[\widehat{Y}_{NN} (\mathbf{x}_i \mid X_{N_i})_{:,j} = K^{(j)}_\theta (\mathbf{x}_i, X_{N_i}) (K^{(j)}_\theta (X_{N_i}, X_{N_i}) + \varepsilon_j)^{-1} Y(X_{N_i})_{:,j}.\]

Here $X_{N_i}$ is the set of nearest neighbors of $\mathbf{x}_i$ in the training data, $K^{(j)}_\theta$ is the kernel functor associated with the jth internal model, corresponding to the jth response dimension, $\varepsilon_j$ is a diagonal noise matrix whose diagonal elements are informed by the value of the self.models[j].noise hyperparameter, and $Y(X_{N_i})_{:,j}$ is the (nn_count,) vector of the jth responses of the nearest neighbors given by a slice of the batch_nn_targets argument.

Parameters:

pairwise_diffs (ndarray) – A tensor of shape (batch_count, nn_count, nn_count, feature_count) containing the (nn_count, nn_count, feature_count)-shaped pairwise nearest neighbor difference tensors corresponding to each of the batch elements.
crosswise_diffs (ndarray) – A matrix of shape (batch_count, nn_count, feature_count) whose rows list the difference between each feature of each batch element and its nearest neighbors.
batch_nn_targets (ndarray) – A tensor of shape (batch_count, nn_count, response_count) listing the vector-valued responses for the nearest neighbors of each batch element.

Return type:

ndarray

Returns:

A matrix of shape (batch_count, response_count) whose rows are the predicted response for each of the given indices.
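The per-response posterior mean can be sketched with NumPy as a batched solve followed by a row-wise dot product. The code below is illustrative only: `rbf_kernel`, `length_scales`, and `noises` are hypothetical stand-ins for the internal models' kernels and noise hyperparameters.

```python
import numpy as np


def rbf_kernel(diffs, length_scale):
    """Stand-in isotropic RBF kernel applied over the trailing feature axis."""
    sq_dists = np.sum(diffs**2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def posterior_mean_sketch(
    pairwise_diffs, crosswise_diffs, batch_nn_targets, length_scales, noises
):
    """Compute Kcross_j (K_j + eps_j I)^{-1} Y[:, :, j] for each response j."""
    batch_count, nn_count, response_count = batch_nn_targets.shape
    eye = np.eye(nn_count)
    mean = np.empty((batch_count, response_count))
    for j in range(response_count):
        K = rbf_kernel(pairwise_diffs, length_scales[j]) + noises[j] * eye
        Kcross = rbf_kernel(crosswise_diffs, length_scales[j])
        solved = np.linalg.solve(K, batch_nn_targets[:, :, j, None])[..., 0]
        mean[:, j] = np.einsum("bn,bn->b", Kcross, solved)
    return mean


# Three neighborhoods, each a shifted unit square of four neighbors.
base = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
shifts = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
neighbors = base[None, :, :] + shifts[:, None, :]  # (3, 4, 2)
batch = neighbors[:, 0, :]  # each query coincides with its first neighbor

pairwise_diffs = neighbors[:, :, None, :] - neighbors[:, None, :, :]
crosswise_diffs = batch[:, None, :] - neighbors
targets = np.random.default_rng(2).normal(size=(3, 4, 2))

mean = posterior_mean_sketch(
    pairwise_diffs, crosswise_diffs, targets, [0.5, 0.7], [1e-8, 1e-8]
)
print(mean.shape)  # (3, 2)
```

With near-zero noise, a query that coincides with a neighbor recovers that neighbor's response, which is a convenient sanity check on the solve.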

posterior_variance(pairwise_diffs, crosswise_diffs)[source]

Performs simultaneous posterior variance inference on provided difference tensors.

Returns the local posterior variances of each prediction, corresponding to the diagonal elements of a covariance matrix. For each batch element $\mathbf{x}_i$, we compute

\[\mathrm{Var}(\widehat{Y}_{NN} (\mathbf{x}_i \mid X_{N_i}))_j = K^{(j)}_\theta (\mathbf{x}_i, \mathbf{x}_i) - K^{(j)}_\theta (\mathbf{x}_i, X_{N_i}) (K^{(j)}_\theta (X_{N_i}, X_{N_i}) + \varepsilon_j)^{-1} K^{(j)}_\theta (X_{N_i}, \mathbf{x}_i).\]

Parameters:

pairwise_diffs (ndarray) – A tensor of shape (batch_count, nn_count, nn_count, feature_count) containing the (nn_count, nn_count, feature_count)-shaped pairwise nearest neighbor difference tensors corresponding to each of the batch elements.
crosswise_diffs (ndarray) – A matrix of shape (batch_count, nn_count, feature_count) whose rows list the difference between each feature of each batch element and its nearest neighbors.

Return type:

ndarray

Returns:

A matrix of shape (batch_count, response_count) consisting of the diagonal elements of the posterior variance for each model.
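The variance formula can be sketched in NumPy in the same per-response fashion. As before, `rbf_kernel`, `length_scales`, and `noises` are hypothetical stand-ins; the stand-in kernel has unit amplitude, so the prior term $K(\mathbf{x}_i, \mathbf{x}_i)$ is taken to be 1, and no scale parameter is applied.

```python
import numpy as np


def rbf_kernel(diffs, length_scale):
    """Stand-in isotropic RBF kernel applied over the trailing feature axis."""
    sq_dists = np.sum(diffs**2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def posterior_variance_sketch(pairwise_diffs, crosswise_diffs, length_scales, noises):
    """Compute 1 - Kcross_j (K_j + eps_j I)^{-1} Kcross_j^T per response j."""
    batch_count, nn_count = crosswise_diffs.shape[:2]
    eye = np.eye(nn_count)
    variance = np.empty((batch_count, len(length_scales)))
    for j, (length_scale, noise) in enumerate(zip(length_scales, noises)):
        K = rbf_kernel(pairwise_diffs, length_scale) + noise * eye
        Kcross = rbf_kernel(crosswise_diffs, length_scale)
        solved = np.linalg.solve(K, Kcross[:, :, None])[..., 0]
        # Prior variance is 1 for the unit-amplitude stand-in kernel.
        variance[:, j] = 1.0 - np.einsum("bn,bn->b", Kcross, solved)
    return variance


base = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
neighbors = np.stack([base, base])  # two identical neighborhoods, shape (2, 4, 2)
# First query sits on a neighbor; second query is far from all neighbors.
batch = np.array([[0.0, 0.0], [100.0, 100.0]])

pairwise_diffs = neighbors[:, :, None, :] - neighbors[:, None, :, :]
crosswise_diffs = batch[:, None, :] - neighbors
variance = posterior_variance_sketch(
    pairwise_diffs, crosswise_diffs, [0.5, 0.7], [1e-8, 1e-8]
)
print(variance.shape)  # (2, 2)
```

The two queries bracket the expected behavior: the variance is close to zero where the query coincides with observed data and returns to the prior value of 1 far from any neighbor.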