MuyGPS

class MuyGPyS.gp.muygps.MuyGPS(kern='matern', eps={'val': 0.0}, **kwargs)[source]

Local Kriging Gaussian Process.

Performs approximate GP inference by locally approximating an observation’s response using its nearest neighbors. Implements the MuyGPs algorithm as articulated in [muyskens2021muygps].

Kernels accept different hyperparameter dictionaries specifying hyperparameter settings. Keys can include val and bounds. bounds must be either a len == 2 iterable container whose elements are scalars in increasing order, or the string "fixed". If bounds == "fixed" (the default behavior), the hyperparameter value will remain fixed during optimization. val must be either a scalar (within the range of the upper and lower bounds, if given) or one of the strings "sample" or "log_sample", which randomly sample a value within the range given by the bounds.
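The rules for val and bounds can be illustrated with a small standalone sketch. The helper resolve_hyperparameter is hypothetical and not part of MuyGPyS; it only mirrors the semantics described here. The choice of uniform sampling for "sample" and log-uniform sampling for "log_sample" is an assumption.

```python
import math
import random


def resolve_hyperparameter(spec, rng=None):
    """Hypothetical helper: turn a {"val": ..., "bounds": ...} dict into (value, bounds)."""
    rng = rng or random.Random(0)
    bounds = spec.get("bounds", "fixed")  # "fixed" is the documented default
    val = spec["val"]
    if bounds == "fixed":
        if isinstance(val, str):
            raise ValueError("sampling a value requires numeric bounds")
        return float(val), "fixed"
    lower, upper = bounds
    if not lower < upper:
        raise ValueError("bounds must be in increasing order")
    if val == "sample":
        # Assumption: uniform sampling between the bounds.
        val = rng.uniform(lower, upper)
    elif val == "log_sample":
        # Assumption: uniform sampling in log space (requires positive bounds).
        val = math.exp(rng.uniform(math.log(lower), math.log(upper)))
    elif isinstance(val, str):
        raise ValueError(f"unsupported val string: {val!r}")
    elif not lower <= val <= upper:
        raise ValueError("val must lie within the bounds")
    return float(val), (float(lower), float(upper))


# A fixed hyperparameter and one whose starting value is sampled.
fixed_value, fixed_bounds = resolve_hyperparameter({"val": 7.2})
sampled_value, sampled_bounds = resolve_hyperparameter(
    {"val": "log_sample", "bounds": (0.1, 2.5)}
)
```

A scalar val outside the given bounds raises ValueError in this sketch, matching the requirement that val lie within the range of the bounds.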

In addition to individual kernel hyperparameters, each MuyGPS object also possesses a homoscedastic \(\varepsilon\) noise parameter and a vector of \(\sigma^2\) indicating the scale parameter associated with the posterior variance of each dimension of the response.

\(\sigma^2\) is the only parameter assumed to be a training target by default, and is treated differently from all other hyperparameters. All other training targets must be manually specified in k_kwargs.

Example

>>> from MuyGPyS.gp.muygps import MuyGPS
>>> k_kwargs = {
...         "kern": "rbf",
...         "metric": "F2",
...         "eps": {"val": 1e-5},
...         "nu": {"val": 0.38, "bounds": (0.1, 2.5)},
...         "length_scale": {"val": 7.2},
... }
>>> muygps = MuyGPS(**k_kwargs)

MuyGPyS depends upon linear operations on specially-constructed tensors in order to efficiently estimate GP realizations. Use MuyGPyS.gp.distance.pairwise_distances() to construct pairwise distance tensors and MuyGPyS.gp.distance.crosswise_distances() to produce crosswise distance matrices (see their documentation for details). MuyGPS can then use these to construct kernel tensors and cross-covariance matrices, respectively.
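To make the shapes of these tensors concrete, the following is a plain NumPy sketch that does not call MuyGPyS. It assumes Euclidean distances and a brute-force nearest neighbor search; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(20, 3))  # (train_count, feature_count)
test = rng.normal(size=(5, 3))    # (batch_count, feature_count)
nn_count = 4

# Brute-force nearest neighbors of each test point among the training points.
all_dists = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=-1)
nn_indices = np.argsort(all_dists, axis=1)[:, :nn_count]  # (batch_count, nn_count)

# Crosswise distances: each test point to each of its own neighbors.
neighbors = train[nn_indices]  # (batch_count, nn_count, feature_count)
crosswise_dists = np.linalg.norm(test[:, None, :] - neighbors, axis=-1)

# Pairwise distances: between the neighbors of each test point.
diffs = neighbors[:, :, None, :] - neighbors[:, None, :, :]
pairwise_dists = np.linalg.norm(diffs, axis=-1)  # (batch_count, nn_count, nn_count)
```

Each slice pairwise_dists[i] is a symmetric matrix with a zero diagonal, and crosswise_dists[i] is the matching vector of distances from the ith test point to its neighbors.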

We can easily realize kernel tensors using a MuyGPS object’s kernel functor once we have computed a pairwise_dists tensor and a crosswise_dists matrix.

Example

>>> K = muygps.kernel(pairwise_dists)
>>> Kcross = muygps.kernel(crosswise_dists)

Parameters
  • kern (str) – The kernel to be used. Each kernel supports different hyperparameters that can be specified in kwargs. Currently supports only matern and rbf.

  • eps (Dict[str, Union[float, Tuple[float, float]]]) – A hyperparameter dict.

  • kwargs – Additional parameters to be passed to the kernel, possibly including additional hyperparameter dicts and a metric keyword.

build_fast_regress_coeffs(train, nn_indices, targets, indices_by_rank=False)[source]

Produces coefficient matrix for fast regression given in Equation (8) of [dunton2022fast]. To form each row of this matrix, we compute

\[\mathbf{C}_{N^*}(i, :) = (K_{\hat{\theta}} (X_{N^*}, X_{N^*}) + \varepsilon I_k)^{-1} Y(X_{N^*}).\]

Here \(X_{N^*}\) is the union of the nearest neighbor of the \(i\)th test point and the nn_count - 1 nearest neighbors of this nearest neighbor, \(K_{\hat{\theta}}\) is the trained kernel functor specified by self.kernel, \(\varepsilon I_k\) is a diagonal homoscedastic noise matrix whose diagonal is the value of the self.eps hyperparameter, and \(Y(X_{N^*})\) is the (train_count,) vector of responses corresponding to the training features indexed by \(N^*\).

Parameters
  • train (ndarray) – The full training data matrix of shape (train_count, feature_count).

  • nn_indices (ndarray) – The nearest neighbor indices of each training point, of shape (train_count, nn_count).

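  • targets (ndarray) – A matrix of shape (train_count, response_count) whose rows are vector-valued responses for each training element.

  • indices_by_rank (bool) – If True, construct the tensors using local indices with no communication. Only for use in MPI mode.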

Return type

ndarray

Returns

A matrix of shape (train_count, nn_count) whose rows are the precomputed coefficients for fast regression.
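The coefficient rows can be sketched in plain NumPy as one regularized linear solve per training point. This is an illustration of Equation (8) under stated assumptions, not the library's implementation: the rbf function stands in for self.kernel, the responses are scalar, and the neighbor search is brute force.

```python
import numpy as np


def rbf(dists, length_scale=1.0):
    """Illustrative RBF kernel on Euclidean distances (a stand-in for self.kernel)."""
    return np.exp(-(dists ** 2) / (2.0 * length_scale ** 2))


rng = np.random.default_rng(0)
train_count, feature_count, nn_count, eps = 30, 2, 5, 1e-5
train = rng.normal(size=(train_count, feature_count))
targets = np.sin(train.sum(axis=1))  # (train_count,) scalar responses

# Row i: training point i followed by its nn_count - 1 nearest neighbors.
all_dists = np.linalg.norm(train[:, None, :] - train[None, :, :], axis=-1)
nn_indices = np.argsort(all_dists, axis=1)[:, :nn_count]

neighbors = train[nn_indices]  # (train_count, nn_count, feature_count)
pairwise_dists = np.linalg.norm(
    neighbors[:, :, None, :] - neighbors[:, None, :, :], axis=-1
)
K = rbf(pairwise_dists)           # (train_count, nn_count, nn_count)
nn_targets = targets[nn_indices]  # (train_count, nn_count)

# Batched solve of (K + eps I) c = y, one system per training point.
K_noisy = K + eps * np.eye(nn_count)
coeffs_mat = np.linalg.solve(K_noisy, nn_targets[..., None])[..., 0]
```

Solving the linear systems rather than forming explicit inverses is a design choice of this sketch; it yields the same rows with better numerical behavior.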

fast_regress(Kcross, coeffs_mat)[source]

Performs fast regression using provided cross-covariance and precomputed coefficient matrix.

Assumes that the cross-covariance matrix Kcross is already computed and given as an argument. To implicitly construct these values from indices, instead use fast_regress_from_indices().

Returns the predicted response in the form of a posterior mean for each element of the batch of observations, as computed in Equation (9) of [dunton2022fast]. For each test point \(\mathbf{z}\), we compute

\[\widehat{Y} (\mathbf{z} \mid X) = K_\theta (\mathbf{z}, X_{N^*}) \mathbf{C}_{N^*}.\]

Here \(X_{N^*}\) is the union of the nearest neighbor of the queried test point \(\mathbf{z}\) and the nearest neighbors of that training point, \(K_\theta\) is the kernel functor specified by self.kernel, and \(\mathbf{C}_{N^*}\) is the matrix of precomputed coefficients given in Equation (8) of [dunton2022fast].

Parameters
  • Kcross (ndarray) – A matrix of shape (batch_count, nn_count) containing the (1, nn_count)-shaped cross-covariance vector corresponding to each of the batch elements.

  • coeffs_mat (ndarray) – A matrix of shape (batch_count, nn_count) whose rows are given by precomputed coefficients for fast regression.

Return type

ndarray

Returns

A matrix of shape (batch_count, response_count) whose rows are the predicted response for each of the given indices.
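Once the coefficient rows are available, Equation (9) reduces to one inner product per batch element. The NumPy sketch below illustrates this with synthetic inputs and a scalar response, so the result has shape (batch_count,); the kernel matrices are random symmetric positive definite matrices rather than outputs of self.kernel, which is an assumption made to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
batch_count, nn_count, eps = 4, 3, 1e-3

# Synthetic symmetric positive definite kernel matrices, one per batch element.
A = rng.normal(size=(batch_count, nn_count, nn_count))
K = A @ np.transpose(A, (0, 2, 1)) + np.eye(nn_count)
Kcross = rng.normal(size=(batch_count, nn_count))
nn_targets = rng.normal(size=(batch_count, nn_count))

# Precomputed once: rows of the coefficient matrix, as in Equation (8).
K_noisy = K + eps * np.eye(nn_count)
coeffs_mat = np.linalg.solve(K_noisy, nn_targets[..., None])[..., 0]

# Fast regression, Equation (9): one inner product per batch element.
fast_mean = np.einsum("ij,ij->i", Kcross, coeffs_mat)  # (batch_count,)
```

The linear solves happen only when the coefficients are built, so each prediction costs a single nn_count-length inner product.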

fast_regress_from_indices(indices, nn_indices, test_features, train_features, closest_index, coeffs_mat)[source]

Performs fast regression using the provided indices and features, the index of the training point closest to each queried test point, and the precomputed coefficient matrix. The cross-covariance is constructed implicitly rather than passed as an argument.

Returns the predicted response in the form of a posterior mean for each element of the batch of observations, as computed in Equation (9) of [dunton2022fast]. For each test point \(\mathbf{z}\), we compute

\[\widehat{Y} (\mathbf{z} \mid X) = K_\theta (\mathbf{z}, X_{N^*}) \mathbf{C}_{N^*}.\]

Here \(X_{N^*}\) is the union of the nearest neighbor of the queried test point \(\mathbf{z}\) and the nearest neighbors of that training point, \(K_\theta\) is the kernel functor specified by self.kernel, and \(\mathbf{C}_{N^*}\) is the matrix of precomputed coefficients given in Equation (8) of [dunton2022fast].

Parameters
  • indices (ndarray) – A vector of shape (batch_count,) providing the indices of the test features to be queried in the formation of the crosswise distance tensor.

  • nn_indices (ndarray) – A matrix of shape (batch_count, nn_count) providing the index of the closest training point to each queried test point, as well as the nn_count - 1 closest neighbors of that point.

  • test_features (ndarray) – A matrix of shape (batch_count, feature_count) containing the test data points.

  • train_features (ndarray) – A matrix of shape (train_count, feature_count) containing the training data.

  • closest_index (ndarray) – A vector of shape (batch_count,) for which each entry is the index of the training point closest to each queried point.

  • coeffs_mat (ndarray) – A matrix of shape (batch_count, nn_count) providing precomputed coefficients for fast regression.

Return type

ndarray

Returns

A matrix of shape (batch_count,) whose rows are the predicted response for each of the given indices.

fixed()[source]

Checks whether all kernel and model parameters are fixed.

This is a convenience utility to determine whether optimization is required.

Return type

bool

Returns

Returns True if all parameters are fixed, and False otherwise.

get_array_opt_mean_fn()[source]

Return a posterior mean function for use in optimization.

This function is designed for use with MuyGPyS.optimize.chassis.optimize_from_tensors() with opt_method="scipy", and assumes that the optimization parameters will be passed in an (optim_count,) vector where eps is either the last element or is not included.

Return type

Callable

Returns

A function implementing the posterior mean, where eps is either fixed or takes updating values during optimization. The function expects a list of current hyperparameter values for unfixed parameters, which are expected to occur in a certain order matching how they are set in get_optim_params().

get_kwargs_opt_mean_fn()[source]

Return a posterior mean function for use in optimization.

This function is designed for use with MuyGPyS.optimize.chassis.optimize_from_tensors() with opt_method="bayesian", and assumes that either eps will be passed via a keyword argument or not at all.

Return type

Callable

Returns

A function implementing the posterior mean, where eps is either fixed or takes updating values during optimization. The function expects keyword arguments corresponding to current hyperparameter values for unfixed parameters.

get_opt_mean_fn(opt_method)[source]

Return a posterior mean function for use in optimization.

This function is designed for use with MuyGPyS.optimize.chassis.optimize_from_tensors(). The opt_method parameter determines the format of the returned function.

Return type

Callable

Returns

A function implementing a posterior mean, where eps is either fixed or takes updating values during optimization. The format of the function depends upon opt_method.
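The two calling conventions can be contrasted with a short sketch. Everything here is hypothetical: make_array_mean_fn and make_kwargs_mean_fn are illustrative names, the argument order of the returned functions is an assumption, and the sketch tracks only eps while treating K and Kcross as already evaluated.

```python
import numpy as np


def posterior_mean(K, Kcross, nn_targets, eps):
    """Batched posterior mean: Kcross (K + eps I)^{-1} Y for each batch element."""
    nn_count = K.shape[-1]
    solved = np.linalg.solve(K + eps * np.eye(nn_count), nn_targets)
    return np.einsum("ij,ijk->ik", Kcross, solved)


def make_array_mean_fn(fixed_eps=None):
    """Array convention: eps, if unfixed, is the last element of the vector x."""
    def mean_fn(K, Kcross, nn_targets, x):
        eps = fixed_eps if fixed_eps is not None else x[-1]
        return posterior_mean(K, Kcross, nn_targets, eps)
    return mean_fn


def make_kwargs_mean_fn(fixed_eps=None):
    """Keyword convention: eps, if unfixed, arrives as the keyword argument eps."""
    def mean_fn(K, Kcross, nn_targets, **kwargs):
        eps = fixed_eps if fixed_eps is not None else kwargs["eps"]
        return posterior_mean(K, Kcross, nn_targets, eps)
    return mean_fn


rng = np.random.default_rng(3)
batch_count, nn_count, response_count = 3, 4, 1
A = rng.normal(size=(batch_count, nn_count, nn_count))
K = A @ np.transpose(A, (0, 2, 1)) + np.eye(nn_count)
Kcross = rng.normal(size=(batch_count, nn_count))
nn_targets = rng.normal(size=(batch_count, nn_count, response_count))

# Unfixed eps: last entry of the vector, or a keyword argument.
mean_from_array = make_array_mean_fn()(K, Kcross, nn_targets, np.array([0.5, 1e-3]))
mean_from_kwargs = make_kwargs_mean_fn()(K, Kcross, nn_targets, nu=0.5, eps=1e-3)

# Fixed eps: the vector carries only the other hyperparameters.
mean_fixed = make_array_mean_fn(fixed_eps=1e-3)(K, Kcross, nn_targets, np.array([0.5]))
```

All three calls evaluate the same posterior mean; they differ only in how the current value of eps reaches the function.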

get_opt_var_fn(opt_method)[source]

Return a posterior variance function for use in optimization.

This function is designed for use with MuyGPyS.optimize.chassis.optimize_from_tensors(). The opt_method parameter determines the format of the returned function.

Return type

Callable

Returns

A function implementing posterior variance, where eps is either fixed or takes updating values during optimization. The format of the function depends upon opt_method.

get_optim_params()[source]

Return lists of unfixed hyperparameter names, values, and bounds.

Return type

Tuple[List[str], ndarray, ndarray]

Returns

  • names – A list of unfixed hyperparameter names.

  • params – A list of unfixed hyperparameter values.

  • bounds – A list of unfixed hyperparameter bound tuples.
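As a rough illustration of how unfixed hyperparameters might be gathered, and of the check that fixed() describes, consider the sketch below. The functions optim_params and all_fixed are hypothetical stand-ins operating on plain dictionaries, not the MuyGPS methods themselves.

```python
import numpy as np

hyperparameters = {
    "nu": {"val": 0.38, "bounds": (0.1, 2.5)},
    "length_scale": {"val": 7.2, "bounds": "fixed"},
    "eps": {"val": 1e-5},  # bounds omitted, so it defaults to "fixed"
}


def optim_params(hyperparameters):
    """Collect names, values, and bounds of every hyperparameter that is not fixed."""
    names, params, bounds = [], [], []
    for name, spec in hyperparameters.items():
        if spec.get("bounds", "fixed") != "fixed":
            names.append(name)
            params.append(spec["val"])
            bounds.append(tuple(spec["bounds"]))
    return names, np.array(params), np.array(bounds)


def all_fixed(hyperparameters):
    """True when no hyperparameter needs optimization."""
    return len(optim_params(hyperparameters)[0]) == 0


names, params, bounds = optim_params(hyperparameters)
needs_optimization = not all_fixed(hyperparameters)
```

Here only nu is unfixed, so names holds a single entry and bounds has shape (1, 2).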

regress(K, Kcross, batch_nn_targets, variance_mode=None, apply_sigma_sq=True)[source]

Performs simultaneous regression on provided covariance, cross-covariance, and target.

Computes parallelized local solves of systems of linear equations using the last two dimensions of K along with Kcross and batch_nn_targets to predict responses in terms of the posterior mean. Also computes the posterior variance if variance_mode is set appropriately. Assumes that kernel tensor K and cross-covariance matrix Kcross are already computed and given as arguments. To implicitly construct these values from indices (useful if the kernel or distance tensors and matrices are not needed for later reference) instead use regress_from_indices().

Returns the predicted response in the form of a posterior mean for each element of the batch of observations, as computed in Equation (3.4) of [muyskens2021muygps]. For each batch element \(\mathbf{x}_i\), we compute

\[\widehat{Y}_{NN} (\mathbf{x}_i \mid X_{N_i}) = K_\theta (\mathbf{x}_i, X_{N_i}) (K_\theta (X_{N_i}, X_{N_i}) + \varepsilon I_k)^{-1} Y(X_{N_i}).\]

Here \(X_{N_i}\) is the set of nearest neighbors of \(\mathbf{x}_i\) in the training data, \(K_\theta\) is the kernel functor specified by self.kernel, \(\varepsilon I_k\) is a diagonal homoscedastic noise matrix whose diagonal is the value of the self.eps hyperparameter, and \(Y(X_{N_i})\) is the (nn_count, response_count) matrix of responses of the nearest neighbors given by the last two dimensions of the batch_nn_targets argument.

If variance_mode == "diagonal", also return the local posterior variances of each prediction, corresponding to the diagonal elements of a covariance matrix. For each batch element \(\mathbf{x}_i\), we compute

\[Var(\widehat{Y}_{NN} (\mathbf{x}_i \mid X_{N_i})) = K_\theta (\mathbf{x}_i, \mathbf{x}_i) - K_\theta (\mathbf{x}_i, X_{N_i}) (K_\theta (X_{N_i}, X_{N_i}) + \varepsilon I_k)^{-1} K_\theta (X_{N_i}, \mathbf{x}_i).\]

Parameters
  • K (array) – A tensor of shape (batch_count, nn_count, nn_count) containing the (nn_count, nn_count)-shaped kernel matrices corresponding to each of the batch elements.

  • Kcross (array) – A matrix of shape (batch_count, nn_count) containing the (1, nn_count)-shaped cross-covariance matrix corresponding to each of the batch elements.

  • batch_nn_targets (array) – A tensor of shape (batch_count, nn_count, response_count) whose last dimension lists the vector-valued responses for the nearest neighbors of each batch element.

  • variance_mode (Optional[str]) – Specifies the type of variance to return. Currently supports "diagonal" and None. If None, report no variance term.

  • apply_sigma_sq (bool) – Indicates whether to scale the posterior variance by sigma_sq. Unused if variance_mode is None or sigma_sq.trained() is False.

Return type

Union[ndarray, Tuple[ndarray, ndarray]]

Returns

  • responses – A matrix of shape (batch_count, response_count) whose rows are the predicted response for each of the given indices.

  • diagonal_variance – A vector of shape (batch_count,) consisting of the diagonal elements of the posterior variance, or a matrix of shape (batch_count, response_count) for a multidimensional response. Only returned where variance_mode == "diagonal".
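The posterior mean and diagonal variance formulas can be checked with a plain NumPy sketch. It is not the library's code: the rbf function stands in for self.kernel (so \(K_\theta (\mathbf{x}_i, \mathbf{x}_i) = 1\)), the neighbor sets are synthetic, and no \(\sigma^2\) scaling is applied.

```python
import numpy as np


def rbf(dists, length_scale=1.0):
    """Illustrative RBF kernel on Euclidean distances (a stand-in for self.kernel)."""
    return np.exp(-(dists ** 2) / (2.0 * length_scale ** 2))


rng = np.random.default_rng(2)
batch_count, nn_count, feature_count, response_count, eps = 6, 4, 3, 2, 1e-3

# Synthetic batch points, each with nn_count nearby "neighbors" and their responses.
batch = rng.normal(size=(batch_count, feature_count))
neighbors = batch[:, None, :] + 0.5 * rng.normal(
    size=(batch_count, nn_count, feature_count)
)
batch_nn_targets = rng.normal(size=(batch_count, nn_count, response_count))

crosswise_dists = np.linalg.norm(batch[:, None, :] - neighbors, axis=-1)
pairwise_dists = np.linalg.norm(
    neighbors[:, :, None, :] - neighbors[:, None, :, :], axis=-1
)
K = rbf(pairwise_dists)        # (batch_count, nn_count, nn_count)
Kcross = rbf(crosswise_dists)  # (batch_count, nn_count)

# Posterior mean: Kcross (K + eps I)^{-1} Y, solved for the whole batch at once.
K_noisy = K + eps * np.eye(nn_count)
responses = np.einsum("ij,ijk->ik", Kcross, np.linalg.solve(K_noisy, batch_nn_targets))

# Diagonal variance: K(x, x) - Kcross (K + eps I)^{-1} Kcross^T, with K(x, x) = 1 for RBF.
weights = np.linalg.solve(K_noisy, Kcross[..., None])[..., 0]
diagonal_variance = 1.0 - np.einsum("ij,ij->i", Kcross, weights)
```

The batched solve acts on the last two dimensions of K, which mirrors the parallelized local solves described for regress().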

regress_from_indices(indices, nn_indices, test, train, targets, variance_mode=None, apply_sigma_sq=True, return_distances=False, indices_by_rank=False)[source]

Performs simultaneous regression on a list of observations.

This is similar to the old regress API in that it implicitly creates and discards the distance and kernel tensors and matrices. If these data structures are needed for later reference, instead use regress().

Parameters
  • indices (ndarray) – An integral vector of shape (batch_count,) listing the indices of the observations to be approximated.

  • nn_indices (ndarray) – An integral matrix of shape (batch_count, nn_count) listing the nearest neighbor indices for all observations in the test batch.

  • test (ndarray) – The full testing data matrix of shape (test_count, feature_count).

  • train (ndarray) – The full training data matrix of shape (train_count, feature_count).

  • targets (ndarray) – A matrix of shape (train_count, response_count) whose rows are vector-valued responses for each training element.

  • variance_mode (Optional[str]) – Specifies the type of variance to return. Currently supports "diagonal" and None. If None, report no variance term.

  • apply_sigma_sq (bool) – Indicates whether to scale the posterior variance by sigma_sq. Unused if variance_mode is None or sigma_sq.trained() is False.

  • return_distances (bool) – If True, returns a (test_count, nn_count) matrix containing the crosswise distances between the test elements and their nearest neighbor sets and a (test_count, nn_count, nn_count) tensor containing the pairwise distances between the test data’s nearest neighbor sets.

  • indices_by_rank (bool) – If True, construct the tensors using local indices with no communication. Only for use in MPI mode.

Return type

Union[ndarray, Tuple[ndarray, ndarray], Tuple[ndarray, ndarray, ndarray], Tuple[ndarray, ndarray, ndarray, ndarray]]

Returns

  • responses – A matrix of shape (batch_count, response_count) whose rows are the predicted response for each of the given indices.

  • diagonal_variance – A vector of shape (batch_count,) consisting of the diagonal elements of the posterior variance, or a matrix of shape (batch_count, response_count) for a multidimensional response. Only returned where variance_mode == "diagonal".

  • crosswise_dists – A matrix of shape (test_count, nn_count) whose rows list the distance of the corresponding test element to each of its nearest neighbors. Only returned if return_distances is True.

  • pairwise_dists – A tensor of shape (test_count, nn_count, nn_count) whose latter two dimensions contain square matrices containing the pairwise distances between the nearest neighbors of the test elements. Only returned if return_distances is True.

set_eps(**eps)[source]

Reset \(\varepsilon\) value or bounds.

Uses existing value and bounds as defaults.

Parameters

eps – A hyperparameter dict.

Return type

None