
Cross-entropy

Cross-entropy is a measure of the difference between two probability distributions. In information theory, the cross-entropy H(P, Q) between a true distribution P and a predicted distribution Q is defined as H(P, Q) = - sum_x P(x) log Q(x). The logarithm can be base e (nats) or base 2 (bits). A related quantity, the entropy H(P), depends only on P.
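
As a quick illustration of the definition, here is a minimal sketch (assuming NumPy as the only dependency, with made-up example distributions) that computes H(P, Q) for two discrete distributions, in nats or bits.

```python
import numpy as np

def cross_entropy(p, q, base=np.e):
    """Cross-entropy H(P, Q) = -sum_x P(x) log Q(x) for discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Terms where P(x) == 0 contribute nothing, regardless of Q(x).
    mask = p > 0
    return -np.sum(p[mask] * np.log(q[mask])) / np.log(base)

p = [0.5, 0.25, 0.25]   # "true" distribution P
q = [0.4, 0.4, 0.2]     # "predicted" distribution Q

print(cross_entropy(p, q))          # in nats (natural log)
print(cross_entropy(p, q, base=2))  # in bits (log base 2)
```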

In machine learning, cross-entropy is commonly used as a loss function to train probabilistic models. It is equivalent to the negative log-likelihood of the data under the model's predicted distribution, so minimizing cross-entropy corresponds to maximizing the likelihood of the observed labels. Ground-truth labels are often represented as a distribution P, while the model outputs a distribution Q over classes (for example, via a softmax layer).
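
A minimal sketch of this equivalence (plain NumPy; the logit values and class index are hypothetical): the cross-entropy of a one-hot label against a softmax output equals the negative log-probability the model assigns to the observed class.

```python
import numpy as np

def softmax(logits):
    z = logits - np.max(logits)           # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0])       # hypothetical model outputs for 3 classes
true_class = 0

q = softmax(logits)                       # predicted distribution Q over classes
p = np.zeros_like(q)                      # one-hot "distribution" P
p[true_class] = 1.0

cross_entropy = -np.sum(p * np.log(q))
neg_log_likelihood = -np.log(q[true_class])

print(cross_entropy, neg_log_likelihood)  # identical values
```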

There are two widely used forms. Binary cross-entropy applies to binary or multi-label classification and, for a single example, is -[y log p + (1 - y) log(1 - p)], where y is the true label and p is the predicted probability. Categorical cross-entropy applies to multi-class, single-label classification and, for an example, is - sum_i y_i log p_i, with y_i as the one-hot true label and p_i as the model's predicted probability for class i. In multi-label tasks, binary cross-entropy is typically used per class with independent outputs.
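
The two forms written out directly, as a sketch in plain NumPy (the labels and probabilities below are made up; averaging over examples and clipping to avoid log(0) are common conventions, not part of the definition):

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """-[y log p + (1 - y) log(1 - p)], averaged over examples."""
    p = np.clip(p, eps, 1 - eps)          # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def categorical_cross_entropy(y_onehot, p, eps=1e-12):
    """-sum_i y_i log p_i, averaged over examples."""
    p = np.clip(p, eps, 1.0)
    return -np.mean(np.sum(y_onehot * np.log(p), axis=-1))

# Binary / multi-label: one probability per (example, label).
y_bin = np.array([1.0, 0.0, 1.0])
p_bin = np.array([0.9, 0.2, 0.6])
print(binary_cross_entropy(y_bin, p_bin))

# Multi-class, single-label: one distribution over classes per example.
y_cat = np.array([[1, 0, 0], [0, 0, 1]], dtype=float)   # one-hot labels
p_cat = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
print(categorical_cross_entropy(y_cat, p_cat))
```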

Cross-entropy is related to KL divergence by H(P, Q) = H(P) + KL(P || Q); since the entropy H(P) is a constant with respect to the model, minimizing cross-entropy thus reduces the divergence between the true and predicted distributions. It is differentiable and widely used with gradient-based optimization, with practical considerations including numerical stability, label smoothing, and class weighting. Cross-entropy is not a true metric: it is not symmetric in P and Q, and H(P, P) equals the entropy H(P) rather than zero.
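
The decomposition can be checked numerically; the sketch below (NumPy only, with arbitrary example distributions) also shows label smoothing in its common form, mixing a one-hot target with the uniform distribution.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    return -np.sum(p[p > 0] * np.log(p[p > 0]))

def cross_entropy(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return -np.sum(p[mask] * np.log(q[mask]))

def kl_divergence(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = np.array([0.5, 0.25, 0.25])   # arbitrary "true" distribution
q = np.array([0.4, 0.4, 0.2])     # arbitrary "predicted" distribution

# H(P, Q) = H(P) + KL(P || Q): both prints give the same value.
print(cross_entropy(p, q))
print(entropy(p) + kl_divergence(p, q))

# Label smoothing: replace a one-hot target with a slightly softened one.
one_hot = np.array([1.0, 0.0, 0.0])
alpha = 0.1
smoothed = (1 - alpha) * one_hot + alpha / len(one_hot)
print(cross_entropy(smoothed, q))
```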