loss

Loss Function Handling

MuyGPyS includes predefined loss functions and convenience functions for indicating them to optimization.

class MuyGPyS.optimize.loss.LossFn(loss_fn, make_predict_and_loss_fn)[source]

Loss functor class.

MuyGPyS-compatible loss functions are objects of this class. Creating a new loss function is as simple as instantiating a new LossFn object, as sketched below.

Parameters:
  • loss_fn (Callable) – A Callable with signature (predictions, targets, **kwargs) or (predictions, targets, variances, scale, **kwargs) that computes a floating-point loss score of a set of predictions given posterior means and possibly posterior variances. Individual loss functions can implement different kwargs as needed.

  • make_predict_and_loss_fn (Callable) – A Callable with signature (loss_fn, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets, **loss_kwargs) that produces a function that computes posterior predictions and scores them using the loss function. MuyGPyS.optimize.loss._make_raw_predict_and_loss_fn and MuyGPyS.optimize.loss._make_var_predict_and_loss_fn are two candidates.
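
For example, a mean-only mean absolute error loss could be assembled as follows. This is a minimal sketch using plain numpy (appropriate for the numpy backend) and the public make_raw_predict_and_loss_fn factory documented below; the library's built-in losses are written against its configurable math backend instead.

import numpy as np

from MuyGPyS.optimize.loss import LossFn, make_raw_predict_and_loss_fn

def _mae(predictions, targets, **kwargs):
    # sum of absolute residuals over the batch
    return np.sum(np.abs(predictions - targets))

mae_fn = LossFn(_mae, make_raw_predict_and_loss_fn)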

MuyGPyS.optimize.loss.cross_entropy_fn = <MuyGPyS.optimize.loss.LossFn object>

Cross entropy function.

Computes the cross entropy loss of the predicted versus known response. Transforms predictions to be row-stochastic, and ensures that targets contains no negative elements. Only defined for two or more labels. For a sample with true labels \(y_i \in \{0, 1\}\) and estimates \(\bar{\mu}(x_i) = \textrm{Pr}(y = 1)\), the function computes

\[\ell_\textrm{cross-entropy}(\bar{\mu}, y) = - \sum_{i=1}^{b} \left [ y_i \log(\bar{\mu}_i) + (1 - y_i) \log(1 - \bar{\mu}_i) \right ].\]

The numpy backend uses sklearn’s implementation.

Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • eps – Probabilities are clipped to the range [eps, 1 - eps].

Returns:

The cross-entropy loss of the prediction.
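
For reference, the clipped binary cross entropy above can be sketched in plain numpy as follows. This is an illustration of the formula only, not the library's implementation, which additionally normalizes predictions to be row-stochastic and dispatches to the configured backend.

import numpy as np

def cross_entropy_sketch(predictions, targets, eps=1e-15):
    # clip probabilities away from 0 and 1 so the logarithms stay finite
    probs = np.clip(predictions, eps, 1 - eps)
    # negative log likelihood of the true labels under the predicted probabilities
    return -np.sum(targets * np.log(probs) + (1 - targets) * np.log(1 - probs))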

MuyGPyS.optimize.loss.lool_fn = <MuyGPyS.optimize.loss.LossFn object>

Leave-one-out likelihood function.

Computes leave-one-out likelihood (LOOL) loss of the predicted versus known response. Treats multivariate outputs as interchangeable in terms of loss penalty. The function computes

\[\ell_\textrm{lool}(\bar{\mu}, y, \bar{\Sigma}) = \sum_{i=1}^b \sum_{j=1}^s \left ( \frac{(\bar{\mu}_i - y_i)^2}{\bar{\Sigma}_{ii}} \right )_j + \left ( \log \bar{\Sigma}_{ii} \right )_j.\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • variances – The unscaled variance of the predicted responses of shape (batch_count, response_count).

  • scale – The variance scaling parameter of shape (response_count,).

Returns:

The LOOL loss of the prediction.
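
A plain numpy sketch of the quantity above, for illustration only (the library evaluates this through its configured backend):

import numpy as np

def lool_sketch(predictions, targets, variances, scale):
    # apply the scale parameter to the unscaled posterior variances
    sigma = variances * scale
    # squared error over variance plus the log-variance penalty, summed over
    # all batch elements and response dimensions
    return np.sum((predictions - targets) ** 2 / sigma + np.log(sigma))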

MuyGPyS.optimize.loss.lool_fn_unscaled = <MuyGPyS.optimize.loss.LossFn object>

Leave-one-out likelihood function.

Computes leave-one-out likelihood (LOOL) loss of the predicted versus known response. Treats multivariate outputs as interchangeable in terms of loss penalty. Unlike lool_fn, does not require scale as an argument. The function computes

\[\ell_\textrm{lool}(\bar{\mu}, y, \bar{\Sigma}) = \sum_{i=1}^b \frac{(\bar{\mu}_i - y_i)^2}{\bar{\Sigma}_{ii}} + \log \bar{\Sigma}_{ii}.\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • variances – The unscaled variance of the predicted responses of shape (batch_count, response_count).

Returns:

The LOOL loss of the prediction.
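
The unscaled variant is the same quantity with the variances used as-is; a plain numpy illustration:

import numpy as np

def lool_unscaled_sketch(predictions, targets, variances):
    # identical to the scaled version with scale fixed to one
    return np.sum((predictions - targets) ** 2 / variances + np.log(variances))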

MuyGPyS.optimize.loss.looph_fn = <MuyGPyS.optimize.loss.LossFn object>

Variance-regularized pseudo-Huber loss function.

Computes a smooth approximation to the Huber loss function, similar to pseudo_huber_fn(), with the addition of both a variance scaling and an additive logarithmic variance regularization term to keep the variance from exploding. This is intended as an outlier-robust replacement for lool_fn. As with the pseudo-Huber loss, the boundary_scale parameter is sensitive to the units of the responses and must be set accordingly. The function computes

\[\ell_\textrm{looph}(\bar{\mu}, y, \bar{\Sigma} \mid \delta) = \sum_{i=1}^b \sum_{j=1}^s 2\delta^2 \left ( \sqrt{ 1 + \left ( \frac{(y_i - \bar{\mu}_i)^2}{\delta^2 \bar{\Sigma}_{ii}} \right )_j } - 1 \right ) + \left ( \log \bar{\Sigma}_{ii} \right )_j.\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • variances – The unscaled variance of the predicted responses of shape (batch_count, response_count).

  • scale – The variance scaling parameter of shape (response_count,).

  • boundary_scale – The boundary value for the residual beyond which the loss becomes approximately linear. Corresponds to the number of standard deviations beyond which to linearize the loss. The default value is 3.0, which is sufficiently tight for most realistic problems.

Returns:

The sum of leave-one-out pseudo-Huber losses of the predictions.
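
A plain numpy sketch of the quantity above, again as an illustration of the formula rather than the library's implementation:

import numpy as np

def looph_sketch(predictions, targets, variances, scale, boundary_scale=3.0):
    d2 = boundary_scale ** 2
    sigma = variances * scale
    # pseudo-Huber penalty on the variance-scaled residual plus a
    # log-variance regularization term
    return np.sum(
        2 * d2 * (np.sqrt(1 + (targets - predictions) ** 2 / (d2 * sigma)) - 1)
        + np.log(sigma)
    )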

MuyGPyS.optimize.loss.make_raw_predict_and_loss_fn(loss_fn, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets, **loss_kwargs)[source]

Make a predict_and_loss function that depends only on the posterior mean.

Assembles a new function with signature (K, Kcross, *args, **kwargs) that computes the posterior mean and uses the passed loss_fn to score it against the batch targets.

Parameters:
  • loss_fn (Callable) – A loss function Callable with signature (predictions, targets, **kwargs), where predictions and targets are matrices of shape (batch_count, response_count).

  • mean_fn (Callable) – A MuyGPS posterior mean function Callable with signature (K, Kcross, batch_nn_targets), which are tensors of shape (batch_count, nn_count, nn_count), (batch_count, nn_count), and (batch_count, nn_count, response_count), respectively.

  • var_fn (Callable) – A MuyGPS posterior variance function Callable with signature (K, Kcross), which are tensors of shape (batch_count, nn_count, nn_count) and (batch_count, nn_count), respectively. Unused by this function, but still required by the signature.

  • scale_fn (Callable) – A MuyGPS scale optimization function Callable with signature (K, batch_nn_targets), which are tensors of shape (batch_count, nn_count, nn_count) and (batch_count, nn_count, response_count), respectively. Unused by this function, but still required by the signature.

  • batch_nn_targets (ndarray) – A tensor of shape (batch_count, nn_count, response_count) containing the expected response of the nearest neighbors of each batch element.

  • batch_targets (ndarray) – A matrix of shape (batch_count, response_count) containing the expected response of each batch element.

  • loss_kwargs – Additional keyword arguments used by the loss function.

Return type:

Callable

Returns:

A Callable with signature (K, Kcross, *args, **kwargs) -> float that computes the posterior mean and applies the loss function to it and the batch_targets.
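
A hypothetical wiring of this factory, using toy stand-ins for the mean, variance, and scale functions. In practice these are the corresponding MuyGPS model functions; the shapes and callables below are assumptions for illustration only.

import numpy as np

from MuyGPyS.optimize.loss import make_raw_predict_and_loss_fn

batch_count, nn_count, response_count = 100, 30, 1
batch_nn_targets = np.zeros((batch_count, nn_count, response_count))
batch_targets = np.zeros((batch_count, response_count))

# toy callables matching the documented signatures
mean_fn = lambda K, Kcross, nn_targets: np.mean(nn_targets, axis=1)
var_fn = lambda K, Kcross: np.ones((batch_count, response_count))
scale_fn = lambda K, nn_targets: np.ones((response_count,))
mse = lambda predictions, targets, **kwargs: np.mean((predictions - targets) ** 2)

predict_and_loss = make_raw_predict_and_loss_fn(
    mse, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets
)
# predict_and_loss(K, Kcross) -> float: the posterior mean scored against batch_targets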

MuyGPyS.optimize.loss.make_var_predict_and_loss_fn(loss_fn, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets, **loss_kwargs)[source]

Make a predict_and_loss function that depends on the posterior mean and variance.

Assembles a new function with signature (K, Kcross, *args, **kwargs) that computes the posterior mean and variance and uses the passed loss_fn to score them against the batch targets.

Parameters:
  • loss_fn (Callable) – A loss function Callable with signature (predictions, targets, **kwargs), where predictions and targets are matrices of shape (batch_count, response_count).

  • mean_fn (Callable) – A MuyGPS posterior mean function Callable with signature (K, Kcross, batch_nn_targets), which are tensors of shape (batch_count, nn_count, nn_count), (batch_count, nn_count), and (batch_count, nn_count, response_count), respectively.

  • var_fn (Callable) – A MuyGPS posterior variance function Callable with signature (K, Kcross), which are tensors of shape (batch_count, nn_count, nn_count) and (batch_count, nn_count), respectively.

  • scale_fn (Callable) – A MuyGPS scale optimization function Callable with signature (K, batch_nn_targets), which are tensors of shape (batch_count, nn_count, nn_count) and (batch_count, nn_count, response_count), respectively.

  • batch_nn_targets (ndarray) – A tensor of shape (batch_count, nn_count, response_count) containing the expected response of the nearest neighbors of each batch element.

  • batch_targets (ndarray) – A matrix of shape (batch_count, response_count) containing the expected response of each batch element.

  • loss_kwargs – Additional keyword arguments used by the loss function.

Return type:

Callable

Returns:

A Callable with signature (K, Kcross, *args, **kwargs) -> float that computes the posterior mean and variance and applies the loss function to them and the batch_targets.

MuyGPyS.optimize.loss.mse_fn = <MuyGPyS.optimize.loss.LossFn object>

Mean squared error function.

Computes mean squared error loss of the predicted versus known response. Treats multivariate outputs as interchangeable in terms of loss penalty. The function computes

\[\ell_\textrm{MSE}(\bar{\mu}, y) = \frac{1}{b} \sum_{i=1}^b (\bar{\mu}_i - y_i)^2.\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

Returns:

The MSE loss of the prediction.
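
A plain numpy sketch of the quantity above, for illustration only:

import numpy as np

def mse_sketch(predictions, targets):
    # mean squared residual, averaged over the batch (and over response
    # dimensions, which are treated as interchangeable)
    return np.mean((predictions - targets) ** 2)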

MuyGPyS.optimize.loss.pseudo_huber_fn = <MuyGPyS.optimize.loss.LossFn object>

Pseudo-Huber loss function.

Computes a smooth approximation to the Huber loss function, which behaves like the sensitive squared-error loss for relatively small errors and like the outlier-robust absolute-error loss for larger errors, so that the loss is not overly sensitive to outliers. Uses the form given on Wikipedia. The function computes

\[\ell_\textrm{Pseudo-Huber}(\bar{\mu}, y \mid \delta) = \sum_{i=1}^b \delta^2 \left ( \sqrt{ 1 + \left ( \frac{y_i - \bar{\mu}_i}{\delta} \right )^2 } - 1 \right ).\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • boundary_scale – The boundary value for the residual beyond which the loss becomes approximately linear. Useful values depend on the scale of the response.

Returns:

The sum of pseudo-Huber losses of the predictions.
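
A plain numpy sketch of the quantity above, for illustration only; boundary_scale should be chosen to match the scale of the responses.

import numpy as np

def pseudo_huber_sketch(predictions, targets, boundary_scale):
    d2 = boundary_scale ** 2
    # approximately quadratic where |residual| << boundary_scale and
    # approximately linear where |residual| >> boundary_scale
    return np.sum(d2 * (np.sqrt(1 + (targets - predictions) ** 2 / d2) - 1))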