Model loss functions

loss_mean_squared_error(y_true, y_pred)
loss_mean_absolute_error(y_true, y_pred)
loss_mean_absolute_percentage_error(y_true, y_pred)
loss_mean_squared_logarithmic_error(y_true, y_pred)
loss_squared_hinge(y_true, y_pred)
loss_hinge(y_true, y_pred)
loss_categorical_hinge(y_true, y_pred)
loss_logcosh(y_true, y_pred)
loss_categorical_crossentropy(y_true, y_pred)
loss_sparse_categorical_crossentropy(y_true, y_pred)
loss_kullback_leibler_divergence(y_true, y_pred)
loss_poisson(y_true, y_pred)
loss_cosine_proximity(y_true, y_pred)
loss_cosine_similarity(y_true, y_pred)

y_true | True labels (Tensor) |
---|---|
y_pred | Predictions (Tensor of the same shape as `y_true`) |

Loss functions are supplied through the `loss` parameter of the `compile.keras.engine.training.Model()` function.

Loss functions can be specified using the name of a built-in loss function (e.g. `loss = 'binary_crossentropy'`), a reference to a built-in loss function (e.g. `loss = loss_binary_crossentropy`), or by passing an arbitrary function that returns a scalar for each data point and takes the following two arguments:

`y_true`: True labels (Tensor)

`y_pred`: Predictions (Tensor of the same shape as `y_true`)
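As a hedged illustration of passing an arbitrary function, the sketch below defines a custom loss with the required two-argument signature. The name `custom_mse` is hypothetical, and the code assumes that the backend helpers `k_mean()` and `k_square()` from the keras R package are available. The `compile()` call appears only as a comment because it needs an existing `model` object.

```r
# Hypothetical custom loss: squared error averaged over the last axis,
# which yields one scalar per data point.
# Assumes the keras R package backend helpers k_square() and k_mean().
custom_mse <- function(y_true, y_pred) {
  keras::k_mean(keras::k_square(y_pred - y_true), axis = -1)
}

# Usage (assumes `model` is an already defined Keras model):
# model %>% compile(optimizer = "sgd", loss = custom_mse)
```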

The actual optimized objective is the mean of the output array across all datapoints.
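To make the relationship between per-sample losses and the optimized objective concrete, here is a small base-R sketch with made-up numbers. It mimics a squared-error loss on plain matrices rather than tensors, so it only illustrates the arithmetic and is not how Keras computes the loss internally.

```r
# Toy batch: 3 samples, 2 outputs each (values are invented for illustration)
y_true <- matrix(c(1, 0,
                   0, 1,
                   1, 1), nrow = 3, byrow = TRUE)
y_pred <- matrix(c(0.5, 0,
                   0,   1,
                   0,   0), nrow = 3, byrow = TRUE)

# One scalar per data point: mean squared error over each row
per_sample <- rowMeans((y_pred - y_true)^2)

# The optimized objective is the mean of those per-sample values
objective <- mean(per_sample)

print(per_sample)  # 0.125 0.000 1.000
print(objective)   # 0.375
```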

When using the categorical_crossentropy loss, your targets should be in
categorical format (e.g. if you have 10 classes, the target for each sample
should be a 10-dimensional vector that is all-zeros except for a 1 at the
index corresponding to the class of the sample). In order to convert
integer targets into categorical targets, you can use the Keras utility
function `to_categorical()`:

`categorical_labels <- to_categorical(int_labels, num_classes = NULL)`
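The categorical format described above can be sketched in base R. This is only an illustration of the encoding, not the implementation of `to_categorical()`. It assumes integer labels that start at 0, and the names `int_labels`, `num_classes`, and `one_hot` are placeholders.

```r
# Hypothetical integer class labels, assumed to start at 0
int_labels <- c(0, 2, 1)
num_classes <- 3

# Pick rows of the identity matrix: each row is all zeros except for
# a 1 at the position of the sample's class (shifted by 1 because R
# indexes from 1)
one_hot <- diag(num_classes)[int_labels + 1, , drop = FALSE]

print(one_hot)
```

Each row of `one_hot` sums to 1, and the position of the 1 identifies the class of that sample.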

`log(cosh(x))` is approximately equal to `(x ** 2) / 2` for small `x` and
to `abs(x) - log(2)` for large `x`. This means that 'logcosh' works mostly
like the mean squared error, but will not be so strongly affected by the
occasional wildly incorrect prediction. However, it may return NaNs if the
intermediate value `cosh(y_pred - y_true)` is too large to be represented
in the chosen precision.
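The two approximations and the overflow problem can be checked numerically in base R. The helper `stable_logcosh()` below is a hypothetical name. It uses the algebraically equivalent form `abs(x) + log1p(exp(-2 * abs(x))) - log(2)`, which avoids evaluating `cosh()` directly. This is a sketch of one possible workaround, not a description of what `loss_logcosh()` does internally.

```r
# Direct formula: overflows once cosh(x) exceeds the double range
naive_logcosh <- function(x) log(cosh(x))

# Hypothetical helper: equivalent form that never evaluates cosh(x)
stable_logcosh <- function(x) {
  abs(x) + log1p(exp(-2 * abs(x))) - log(2)
}

# Small x: close to x^2 / 2
small <- 0.01
print(c(naive_logcosh(small), small^2 / 2))

# Large x: close to abs(x) - log(2)
large <- 20
print(c(naive_logcosh(large), abs(large) - log(2)))

# Very large x: cosh() overflows to Inf in double precision,
# while the rewritten form stays finite
huge <- 1000
print(naive_logcosh(huge))
print(stable_logcosh(huge))
```

In plain R doubles the direct formula overflows to `Inf` rather than `NaN`. Whether a tensor backend reports `Inf` or `NaN` in this situation depends on the backend and the precision in use.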