fast_regress
Resources and high-level API for a fast regression workflow.
make_fast_regressor() is a high-level API for creating the necessary components for fast regression.
make_fast_multivariate_regressor() is a high-level API for creating the necessary components for fast regression with multiple outputs.
do_fast_regress() is a high-level API for executing a simple, generic regression workflow given data. It calls the maker APIs above and fast_regress_any().
- MuyGPyS.examples.fast_regress.do_fast_regress(test_features, train_features, train_targets, nn_count=30, batch_count=200, loss_method='lool', obj_method='loo_crossval', opt_method='bayes', sigma_method='analytic', kern=None, k_kwargs={}, nn_kwargs={}, opt_kwargs={}, verbose=False)
Convenience function initializing a model and performing regression.
Expected parameters include keyword argument dicts specifying kernel parameters and nearest neighbor parameters. See the docstrings of the appropriate functions for specifics.
Also supports workflows relying upon multivariate models. In order to create a multivariate model, specify the kern argument and pass a list of hyperparameter dicts to k_kwargs.
Example
>>> from MuyGPyS.testing.test_utils import _make_gaussian_data
>>> from MuyGPyS.examples.fast_regress import do_fast_regress
>>> from MuyGPyS.optimize.objective import mse_fn
>>> train, test = _make_gaussian_data(10000, 1000, 100, 10)
>>> nn_kwargs = {"nn_method": "exact", "algorithm": "ball_tree"}
>>> k_kwargs = {
...     "kern": "rbf",
...     "metric": "F2",
...     "eps": {"val": 1e-5},
...     "length_scale": {"val": 1.0, "bounds": (1e-2, 1e2)}
... }
>>> # Unpack all five documented return values, including the timing dict.
>>> (
...     muygps,
...     nbrs_lookup,
...     predictions,
...     precomputed_coefficients_matrix,
...     timing,
... ) = do_fast_regress(
...     test['input'],
...     train['input'],
...     train['output'],
...     nn_count=30,
...     batch_count=200,
...     loss_method="mse",
...     obj_method="loo_crossval",
...     opt_method="bayes",
...     k_kwargs=k_kwargs,
...     nn_kwargs=nn_kwargs,
...     verbose=False,
... )
- Parameters
test_features (ndarray) – A matrix of shape (test_count, feature_count) whose rows consist of observation vectors of the test data.
train_features (ndarray) – A matrix of shape (train_count, feature_count) whose rows consist of observation vectors of the train data.
train_targets (ndarray) – A matrix of shape (train_count, response_count) whose rows consist of response vectors of the train data.
nn_count (int) – The number of nearest neighbors to employ.
batch_count (int) – The number of batch elements to sample for hyperparameter optimization.
loss_method (str) – The loss method to use in hyperparameter optimization. Ignored if all of the parameters specified by argument k_kwargs are fixed. Currently supports only "mse" for regression.
obj_method (str) – Indicates the objective function to be minimized. Currently restricted to "loo_crossval".
opt_method (str) – Indicates the optimization method to be used. Currently restricted to "bayesian" and "scipy".
sigma_method (Optional[str]) – The optimization method to be employed to learn the sigma_sq hyperparameter. Currently supports only "analytic" and None. If the value is not None, the returned MuyGPyS.gp.muygps.MuyGPS object will possess a sigma_sq member whose value, invoked via muygps.sigma_sq(), is a (response_count,) vector to be used for scaling posterior variances.
kern (Optional[str]) – The kernel function to be used. See kernels for details. Only used in the multivariate case. If None, assume that we are not using a multivariate model.
k_kwargs (Union[Dict, List[Dict], Tuple[Dict, …]]) – If given a list or tuple of length response_count, assume that the elements are dicts containing kernel initialization keyword arguments for the creation of a multivariate model (see make_multivariate_regressor()). If given a dict, assume that the elements are keyword arguments to a MuyGPs model (see make_regressor()).
nn_kwargs (Dict) – Parameters for the nearest neighbors wrapper. See MuyGPyS.neighbors.NN_Wrapper for the supported methods and their parameters.
opt_kwargs (Dict) – Parameters for the wrapped optimizer. See the docs of the corresponding library for supported parameters.
verbose (bool) – If True, print summary statistics.
- Return type
Tuple[ndarray, ndarray, ndarray, ndarray, Dict]
- Returns
muygps – A (possibly trained) MuyGPs object.
nbrs_lookup – A data structure supporting nearest neighbor queries into train_features.
predictions – The predicted response associated with each test observation.
precomputed_coefficients_matrix – A matrix of shape (train_count, nn_count) whose rows list the precomputed coefficients for each nearest neighbors set in the training data.
timing – A dictionary containing timings for the training, precomputation, nearest neighbor computation, and prediction.
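The interplay between kern and k_kwargs described in the parameter list can be sketched in plain Python. The helper select_model_kind below is hypothetical and not part of MuyGPyS; it only mirrors the documented rule that a single dict describes one model, while kern together with a list or tuple of length response_count describes a multivariate model. The contents of the per-response dicts are an assumption made by analogy with the example above.

```python
from typing import Dict, List, Optional, Tuple, Union

KernelKwargs = Union[Dict, List[Dict], Tuple[Dict, ...]]


def select_model_kind(
    kern: Optional[str], k_kwargs: KernelKwargs, response_count: int
) -> str:
    """Hypothetical helper mirroring the documented kern/k_kwargs rule."""
    if kern is None:
        # No kern given: k_kwargs must be one dict describing a single model.
        if not isinstance(k_kwargs, dict):
            raise TypeError("without kern, k_kwargs must be a dict")
        return "univariate"
    # kern given: expect one hyperparameter dict per response dimension.
    if not isinstance(k_kwargs, (list, tuple)):
        raise TypeError("with kern, k_kwargs must be a list or tuple of dicts")
    if len(k_kwargs) != response_count:
        raise ValueError(
            f"expected {response_count} dicts, got {len(k_kwargs)}"
        )
    return "multivariate"


# Single-model configuration, as in the example above.
univariate_kwargs = {
    "kern": "rbf",
    "metric": "F2",
    "eps": {"val": 1e-5},
    "length_scale": {"val": 1.0, "bounds": (1e-2, 1e2)},
}

# Assumed multivariate configuration: one dict per response dimension.
multivariate_kwargs = [
    {"eps": {"val": 1e-5}, "length_scale": {"val": 1.0, "bounds": (1e-2, 1e2)}},
    {"eps": {"val": 1e-5}, "length_scale": {"val": 2.0, "bounds": (1e-2, 1e2)}},
]

uni_kind = select_model_kind(None, univariate_kwargs, response_count=1)
multi_kind = select_model_kind("rbf", multivariate_kwargs, response_count=2)
```

Checking the length of the list against response_count up front gives a clear error before any model is built, which is why the sketch raises rather than silently truncating.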
- MuyGPyS.examples.fast_regress.fast_regress_any(muygps, test_features, train_features, nbrs_lookup, train_targets)
Convenience function performing regression using a pre-trained model.
Also supports workflows relying upon multivariate models.
- Parameters
muygps (Union[MuyGPS, MultivariateMuyGPS]) – A (possibly trained) MuyGPS object.
test_features (ndarray) – A matrix of shape (test_count, feature_count) whose rows consist of observation vectors of the test data.
train_features (ndarray) – A matrix of shape (train_count, feature_count) whose rows consist of observation vectors of the train data.
nbrs_lookup (NN_Wrapper) – A data structure supporting nearest neighbor queries into train_features.
train_targets (ndarray) – A matrix of shape (train_count, response_count) whose rows consist of response vectors of the train data.
- Return type
Tuple[ndarray, ndarray, Dict]
- Returns
predictions – The predicted response associated with each test observation.
precomputed_coefficients_matrix – A matrix of shape (train_count, nn_count) whose rows list the precomputed coefficients for each nearest neighbors set in the training data.
timing – A dictionary containing timings for the training, precomputation, nearest neighbor computation, and prediction.
- MuyGPyS.examples.fast_regress.make_fast_multivariate_regressor(muygps, nbrs_lookup, train_features, train_targets)
Convenience function for creating precomputed coefficient matrix and neighbor lookup data structure.
- Parameters
muygps (MultivariateMuyGPS) – A trained MultivariateMuyGPS object.
nbrs_lookup (NN_Wrapper) – A data structure supporting nearest neighbor queries into train_features.
train_features (ndarray) – A matrix of shape (train_count, feature_count) whose rows consist of observation vectors of the train data.
train_targets (ndarray) – A matrix of shape (train_count, response_count) whose rows consist of response vectors of the train data.
- Return type
Tuple[ndarray, ndarray]
- Returns
precomputed_coefficients_matrix – A matrix of shape (train_count, nn_count) whose rows list the precomputed coefficients for each nearest neighbors set in the training data.
nn_indices – A numpy.ndarray supporting nearest neighbor queries.
- MuyGPyS.examples.fast_regress.make_fast_regressor(muygps, nbrs_lookup, train_features, train_targets)
Convenience function for creating precomputed coefficient matrix and neighbor lookup data structure.
- Parameters
muygps (MuyGPS) – A (possibly trained) MuyGPS object.
nbrs_lookup (NN_Wrapper) – A data structure supporting nearest neighbor queries into train_features.
train_features (ndarray) – A matrix of shape (train_count, feature_count) whose rows consist of observation vectors of the train data.
train_targets (ndarray) – A matrix of shape (train_count, response_count) whose rows consist of response vectors of the train data.
- Return type
Tuple[ndarray, ndarray]
- Returns
precomputed_coefficients_matrix – A matrix of shape (train_count, nn_count) whose rows list the precomputed coefficients for each nearest neighbors set in the training data.
nn_indices – A numpy.ndarray supporting nearest neighbor queries.
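To make the role of the (train_count, nn_count) coefficient matrix and nn_indices more concrete, here is a minimal standard-library sketch of one way precomputation can speed up prediction. It is not the MuyGPyS implementation. It assumes scalar features, a single response, a fixed RBF kernel, and brute-force neighbor search, and every function name in it (rbf, solve, nearest, precompute, fast_predict) is hypothetical. Each training point's neighborhood gets one row of coefficients, obtained by solving its regularized kernel system once. A test prediction then needs only one nearest-neighbor lookup and one dot product.

```python
import math
from typing import List, Tuple


def rbf(a: float, b: float, length_scale: float = 1.0) -> float:
    """RBF kernel between two scalar feature values."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def nearest(query: float, points: List[float], k: int) -> List[int]:
    """Indices of the k points closest to query (brute force)."""
    return sorted(range(len(points)), key=lambda i: abs(points[i] - query))[:k]


def precompute(
    train_x: List[float], train_y: List[float], nn_count: int, eps: float = 1e-5
) -> Tuple[List[List[float]], List[List[int]]]:
    """One coefficient row per training point: (K + eps * I)^-1 y."""
    coeffs, nn_indices = [], []
    for x in train_x:
        idx = nearest(x, train_x, nn_count)  # neighborhood includes x itself
        K = [
            [rbf(train_x[i], train_x[j]) + (eps if i == j else 0.0) for j in idx]
            for i in idx
        ]
        coeffs.append(solve(K, [train_y[i] for i in idx]))
        nn_indices.append(idx)
    return coeffs, nn_indices


def fast_predict(
    z: float,
    train_x: List[float],
    coeffs: List[List[float]],
    nn_indices: List[List[int]],
) -> float:
    """Reuse the coefficients of the training point closest to z."""
    closest = nearest(z, train_x, 1)[0]
    idx = nn_indices[closest]
    # Cross-covariance with the stored neighborhood, dotted with its row.
    return sum(rbf(z, train_x[i]) * c for i, c in zip(idx, coeffs[closest]))


train_x = [i / 10 for i in range(50)]
train_y = [math.sin(x) for x in train_x]
coeffs, nn_indices = precompute(train_x, train_y, nn_count=5)
prediction = fast_predict(2.03, train_x, coeffs, nn_indices)
```

All linear solves happen once, at precomputation time, so the per-query cost in this sketch is a neighbor lookup plus nn_count kernel evaluations. The library itself operates on NumPy arrays and supports multiple responses; the pure-Python form is used here only to keep the sketch dependency-free.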