loss

Loss Function Handling

MuyGPyS includes predefined loss functions and convenience functions for passing them to the optimization utilities.

class MuyGPyS.optimize.loss.LossFn(loss_fn, make_predict_and_loss_fn)[source]

Loss functor class.

MuyGPyS-compatible loss functions are objects of this class. Creating a new loss function is as simple as instantiating a new LossFn object.

Parameters:
  • loss_fn (Callable) – A Callable with signature (predictions, targets, **kwargs) or (predictions, targets, variances, scale, **kwargs) that computes a floating-point loss score of a set of predictions given posterior means and possibly posterior variances. Individual loss functions can implement different kwargs as needed.

  • make_predict_and_loss_fn (Callable) – A Callable with signature (loss_fn, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets, **loss_kwargs) that produces a function that computes posterior predictions and scores them using the loss function. MuyGPyS.optimize.loss.make_raw_predict_and_loss_fn() and MuyGPyS.optimize.loss.make_var_predict_and_loss_fn() are two candidates.
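
For illustration, a new loss can be defined by pairing a plain Callable with one of the predict-and-loss factories documented below. The following is a minimal sketch that registers a hypothetical mean-absolute-error loss depending only on the posterior mean; the names _mae and mae_fn are illustrative placeholders, not part of the library.

    import numpy as np
    from MuyGPyS.optimize.loss import LossFn, make_raw_predict_and_loss_fn

    def _mae(predictions, targets, **kwargs):
        # sum of absolute residuals over all batch elements and response dimensions
        return np.sum(np.abs(predictions - targets))

    # hypothetical loss functor; uses the mean-only factory since _mae ignores variances
    mae_fn = LossFn(_mae, make_raw_predict_and_loss_fn)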

MuyGPyS.optimize.loss.cross_entropy_fn = <MuyGPyS.optimize.loss.LossFn object>

Cross entropy function.

Computes the cross entropy loss of the predicted versus known response. Transforms predictions to be row-stochastic, and ensures that targets contains no negative elements. Only defined for two or more labels. For a sample with true labels \(y_i \in \{0, 1\}\) and estimates \(\bar{\mu}(x_i) = \textrm{Pr}(y = 1)\), the function computes

\[\ell_\textrm{cross-entropy}(\bar{\mu}, y) = - \sum_{i=1}^{b} \left ( y_i \log(\bar{\mu}_i) + (1 - y_i) \log(1 - \bar{\mu}_i) \right ).\]

The numpy backend uses sklearn’s implementation.

Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • eps – Probabilities are clipped to the range [eps, 1 - eps].

Returns:

The cross-entropy loss of the prediction.
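
As a rough numpy sketch of the clipped binary form of this computation (not the library's implementation, which delegates to sklearn in the numpy backend):

    import numpy as np

    def cross_entropy_sketch(predictions, targets, eps=1e-15):
        # clip probabilities away from 0 and 1 so the logarithms stay finite
        p = np.clip(predictions, eps, 1 - eps)
        # negative Bernoulli log likelihood of the targets under the predictions
        return -np.sum(targets * np.log(p) + (1 - targets) * np.log(1 - p))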

MuyGPyS.optimize.loss.lool_fn = <MuyGPyS.optimize.loss.LossFn object>

Leave-one-out likelihood function.

Computes leave-one-out likelihood (LOOL) loss of the predicted versus known response. Treats multivariate outputs as interchangeable in terms of loss penalty. The function computes

\[\ell_\textrm{lool}(\bar{\mu}, y \mid \bar{\Sigma}) = \sum_{i=1}^b \sum_{j=1}^s \left ( \frac{\bar{\mu}_i - y}{\bar{\Sigma}_{ii}} \right )_j^2 + \left ( \log \bar{\Sigma}_{ii} \right )_j\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • variances – The unscaled variance of the predicted responses of shape (batch_count, response_count).

  • scale – The scale variance scaling parameter of shape (response_count,).

Returns:

The LOOL loss of the prediction.
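
A minimal numpy sketch of the formula above, assuming the scaled posterior variance is the elementwise product of variances and scale:

    import numpy as np

    def lool_sketch(predictions, targets, variances, scale):
        # scaled posterior variances, one column per response dimension
        sigma = variances * scale
        # squared standardized residuals plus the log-variance regularizer,
        # summed over batch elements and response dimensions
        return np.sum(((predictions - targets) / sigma) ** 2 + np.log(sigma))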

MuyGPyS.optimize.loss.lool_fn_unscaled = <MuyGPyS.optimize.loss.LossFn object>

Leave-one-out likelihood function.

Computes leave-one-out likelihood (LOOL) loss of the predicted versus known response. Treats multivariate outputs as interchangeable in terms of loss penalty. Unlike lool_fn, does not require scale as an argument. The function computes

\[\ell_\textrm{lool}(\bar{\mu}, y \mid \bar{\Sigma}) = \sum_{i=1}^b \frac{(\bar{\mu}_i - y)^2}{\bar{\Sigma}_{ii}} + \log \bar{\Sigma}_{ii}.\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • variances – The unscaled variance of the predicted responses of shape (batch_count, response_count).

Returns:

The LOOL loss of the prediction.
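
The unscaled variant follows the same pattern, dividing the squared residual by the raw variance; a minimal sketch:

    import numpy as np

    def lool_unscaled_sketch(predictions, targets, variances):
        # squared residuals weighted by the unscaled posterior variances,
        # plus the log-variance regularizer
        return np.sum((predictions - targets) ** 2 / variances + np.log(variances))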

MuyGPyS.optimize.loss.looph_fn = <MuyGPyS.optimize.loss.LossFn object>

Variance-regularized pseudo-Huber loss function.

Computes a smooth approximation to the Huber loss function, similar to pseudo_huber_fn(), with the addition of both a variance scaling and an additive logarithmic variance regularization term to avoid exploding the variance. The function computes

\[\ell_\textrm{looph}(\bar{\mu}, y \mid \delta, \bar{\Sigma}) = \sum_{i=1}^b \sum_{j=1}^s \delta^2 \left ( \sqrt{ 1 + \left ( \frac{y_i - \bar{\mu}_i}{\delta \bar{\Sigma}_{ii}} \right )_j^2 } - 1 \right ) + \left ( \log \bar{\Sigma}_{ii} \right )_j.\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • variances – The unscaled variance of the predicted responses of shape (batch_count, response_count).

  • scale – The scale variance scaling parameter of shape (response_count,).

  • boundary_scale – The boundary value for the residual beyond which the loss becomes approximately linear. Useful values depend on the scale of the response.

Returns:

The sum of leave-one-out pseudo-Huber losses of the predictions.
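
A numpy sketch of the displayed formula, again assuming the scaled posterior variance is the elementwise product of variances and scale, with boundary_scale standing in for \(\delta\):

    import numpy as np

    def looph_sketch(predictions, targets, variances, scale, boundary_scale):
        # scaled posterior variances
        sigma = variances * scale
        # residuals standardized by the scaled variance and the boundary value delta
        ratio = (targets - predictions) / (boundary_scale * sigma)
        # smooth pseudo-Huber penalty plus the log-variance regularizer
        return np.sum(
            boundary_scale**2 * (np.sqrt(1.0 + ratio**2) - 1.0) + np.log(sigma)
        )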

MuyGPyS.optimize.loss.make_raw_predict_and_loss_fn(loss_fn, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets, **loss_kwargs)[source]

Make a predict_and_loss function that depends only on the posterior mean.

Assembles a new function with signature (K, Kcross, *args, **kwargs) that computes the posterior mean and uses the passed loss_fn to score it against the batch targets.

Parameters:
  • loss_fn (Callable) – A loss function Callable with signature (predictions, targets, **kwargs), where predictions and targets are matrices of shape (batch_count, response_count).

  • mean_fn (Callable) – A MuyGPS posterior mean function Callable with signature (K, Kcross, batch_nn_targets), which are tensors of shape (batch_count, nn_count, nn_count), (batch_count, nn_count), and (batch_count, nn_count, response_count), respectively.

  • var_fn (Callable) – A MuyGPS posterior variance function Callable with signature (K, Kcross), which are tensors of shape (batch_count, nn_count, nn_count) and (batch_count, nn_count), respectively. Unused by this function, but still required by the signature.

  • scale_fn (Callable) – A MuyGPS scale optimization function Callable with signature (K, batch_nn_targets), which are tensors of shape (batch_count, nn_count, nn_count) and (batch_count, nn_count, response_count), respectively. Unused by this function, but still required by the signature.

  • batch_nn_targets (ndarray) – A tensor of shape (batch_count, nn_count, response_count) containing the expected response of the nearest neighbors of each batch element.

  • batch_targets (ndarray) – A matrix of shape (batch_count, response_count) containing the expected response of each batch element.

  • loss_kwargs – Additional keyword arguments used by the loss function.

Return type:

Callable

Returns:

A Callable with signature (K, Kcross, *args, **kwargs) -> float that computes the posterior mean and applies the loss function to it and the batch_targets.
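
A wiring sketch using stand-in callables that obey the documented signatures; everything below other than make_raw_predict_and_loss_fn itself (the mean, variance, scale, and loss functions) is a hypothetical placeholder, not MuyGPyS API:

    import numpy as np
    from MuyGPyS.optimize.loss import make_raw_predict_and_loss_fn

    batch_count, nn_count, response_count = 10, 5, 1
    batch_nn_targets = np.random.standard_normal(
        (batch_count, nn_count, response_count)
    )
    batch_targets = np.random.standard_normal((batch_count, response_count))

    # stand-in posterior mean: average of the nearest neighbor responses
    def mean_fn(K, Kcross, nn_targets):
        return nn_targets.mean(axis=1)

    # stand-in variance and scale functions; unused by the raw variant but
    # still required by the signature
    def var_fn(K, Kcross):
        return np.ones((batch_count, response_count))

    def scale_fn(K, nn_targets):
        return np.ones((response_count,))

    # mean-only loss with the documented (predictions, targets, **kwargs) signature
    def sse_loss(predictions, targets, **kwargs):
        return np.sum((predictions - targets) ** 2)

    predict_and_loss = make_raw_predict_and_loss_fn(
        sse_loss, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets
    )
    K = np.random.standard_normal((batch_count, nn_count, nn_count))
    Kcross = np.random.standard_normal((batch_count, nn_count))
    print(predict_and_loss(K, Kcross))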

MuyGPyS.optimize.loss.make_var_predict_and_loss_fn(loss_fn, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets, **loss_kwargs)[source]

Make a predict_and_loss function that depends on the posterior mean and variance.

Assembles a new function with signature (K, Kcross, *args, **kwargs) that computes the posterior mean and variance and uses the passed loss_fn to score them against the batch targets.

Parameters:
  • loss_fn (Callable) – A loss function Callable with signature (predictions, targets, variances, scale, **kwargs), where predictions and targets are matrices of shape (batch_count, response_count).

  • mean_fn (Callable) – A MuyGPS posterior mean function Callable with signature (K, Kcross, batch_nn_targets), which are tensors of shape (batch_count, nn_count, nn_count), (batch_count, nn_count), and (batch_count, nn_count, response_count), respectively.

  • var_fn (Callable) – A MuyGPS posterior variance function Callable with signature (K, Kcross), which are tensors of shape (batch_count, nn_count, nn_count) and (batch_count, nn_count), respectively.

  • scale_fn (Callable) – A MuyGPS scale optimization function Callable with signature (K, batch_nn_targets), which are tensors of shape (batch_count, nn_count, nn_count) and (batch_count, nn_count, response_count), respectively.

  • batch_nn_targets (ndarray) – A tensor of shape (batch_count, nn_count, response_count) containing the expected response of the nearest neighbors of each batch element.

  • batch_targets (ndarray) – A matrix of shape (batch_count, response_count) containing the expected response of each batch element.

  • loss_kwargs – Additional keyword arguments used by the loss function.

Return type:

Callable

Returns:

A Callable with signature (K, Kcross, *args, **kwargs) -> float that computes the posterior mean and variance and applies the loss function to them and the batch_targets.
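
Continuing the sketch above, the variance-aware factory additionally passes the posterior variances and the scale parameter to the loss, so the loss callable should accept the variance-dependent signature (predictions, targets, variances, scale, **kwargs); the loss below is again a hypothetical placeholder, not library code:

    from MuyGPyS.optimize.loss import make_var_predict_and_loss_fn

    # stand-in variance-aware loss: a lool-style penalty
    def lool_style_loss(predictions, targets, variances, scale, **kwargs):
        sigma = variances * scale
        return np.sum((predictions - targets) ** 2 / sigma + np.log(sigma))

    predict_and_loss = make_var_predict_and_loss_fn(
        lool_style_loss, mean_fn, var_fn, scale_fn, batch_nn_targets, batch_targets
    )
    print(predict_and_loss(K, Kcross))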

MuyGPyS.optimize.loss.mse_fn = <MuyGPyS.optimize.loss.LossFn object>

Mean squared error function.

Computes mean squared error loss of the predicted versus known response. Treats multivariate outputs as interchangeable in terms of loss penalty. The function computes

\[\ell_\textrm{MSE}(\bar{\mu}, y) = \frac{1}{b} \sum_{i=1}^b (\bar{\mu}_i - y)^2.\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

Returns:

The mse loss of the prediction.
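
An equivalent numpy one-liner for the displayed formula (a sketch, not the library code):

    import numpy as np

    def mse_sketch(predictions, targets):
        # squared errors summed over responses, averaged over the batch
        return np.sum((predictions - targets) ** 2) / predictions.shape[0]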

MuyGPyS.optimize.loss.pseudo_huber_fn = <MuyGPyS.optimize.loss.LossFn object>

Pseudo-Huber loss function.

Computes a smooth approximation to the Huber loss function, which balances the sensitivity of squared-error loss for small residuals with the outlier robustness of absolute-error loss for large residuals. Uses the form from Wikipedia. The function computes

\[\ell_\textrm{Pseudo-Huber}(\bar{\mu}, y \mid \delta) = \sum_{i=1}^b \delta^2 \left ( \sqrt{ 1 + \left ( \frac{y_i - \bar{\mu}_i}{\delta} \right )^2 } - 1 \right ).\]
Parameters:
  • predictions – The predicted response of shape (batch_count, response_count).

  • targets – The expected response of shape (batch_count, response_count).

  • boundary_scale – The boundary value for the residual beyond which the loss becomes approximately linear. Useful values depend on the scale of the response.

Returns:

The sum of pseudo-Huber losses of the predictions.
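
A numpy sketch matching the displayed formula, with boundary_scale playing the role of \(\delta\):

    import numpy as np

    def pseudo_huber_sketch(predictions, targets, boundary_scale):
        # residuals scaled by the boundary value delta
        ratio = (targets - predictions) / boundary_scale
        # quadratic near zero, approximately linear beyond the boundary value
        return np.sum(boundary_scale**2 * (np.sqrt(1.0 + ratio**2) - 1.0))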
