Cross-entropy
Cross-entropy measures how well one probability distribution approximates another. In information theory, the cross-entropy H(P, Q) of a predicted distribution Q relative to a true distribution P is defined as H(P, Q) = - sum_x P(x) log Q(x), where the sum runs over all possible outcomes x. The logarithm can be taken in base e (nats) or base 2 (bits). A related quantity, the entropy H(P) = - sum_x P(x) log P(x), depends only on P.
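As a concrete illustration, the sketch below computes H(P, Q) for two small discrete distributions, assuming NumPy; the function name cross_entropy and the example distributions are illustrative, not part of any fixed API.

```python
import numpy as np

def cross_entropy(p, q, base=np.e):
    """Cross-entropy H(P, Q) = -sum_x P(x) log Q(x) for discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q)) / np.log(base)

# True distribution P and an imperfect prediction Q over three outcomes.
p = [0.5, 0.25, 0.25]
q = [0.4, 0.4, 0.2]

print(cross_entropy(p, q))          # in nats
print(cross_entropy(p, q, base=2))  # in bits
print(cross_entropy(p, p, base=2))  # equals the entropy H(P) = 1.5 bits
```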
In machine learning, cross-entropy is commonly used as a loss function to train probabilistic models. It is minimized when the model's predicted probabilities match the empirical distribution of the training labels, which makes it a natural objective for classifiers that output probabilities.
There are two widely used forms, sketched below. Binary cross-entropy applies to binary or multi-label classification and treats each output as an independent Bernoulli variable, typically with a sigmoid activation; categorical cross-entropy applies to multi-class classification, where a softmax output gives a distribution over classes and, for one-hot targets, only the log-probability of the true class contributes to the loss.
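The following sketch shows both forms as losses over NumPy arrays of labels and predicted probabilities; the function names and the clipping epsilon are illustrative choices under these assumptions, not a specific library's API.

```python
import numpy as np

EPS = 1e-12  # clip probabilities to avoid log(0)

def binary_cross_entropy(y_true, y_prob):
    """Mean binary cross-entropy; each output is an independent Bernoulli."""
    y_prob = np.clip(y_prob, EPS, 1.0 - EPS)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

def categorical_cross_entropy(y_true_onehot, y_prob):
    """Mean categorical cross-entropy; rows of y_prob are softmax outputs."""
    y_prob = np.clip(y_prob, EPS, 1.0)
    return -np.mean(np.sum(y_true_onehot * np.log(y_prob), axis=1))

# Binary example: two samples with labels 1 and 0.
print(binary_cross_entropy(np.array([1, 0]), np.array([0.9, 0.2])))

# Categorical example: one sample whose true class is index 0.
y_true = np.array([[1, 0, 0]])
y_prob = np.array([[0.7, 0.2, 0.1]])
print(categorical_cross_entropy(y_true, y_prob))  # -log(0.7), about 0.357
```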
Cross-entropy is related to KL divergence by H(P, Q) = H(P) + KL(P || Q); minimizing cross-entropy over Q thus reduces to minimizing KL(P || Q), because the entropy H(P) does not depend on Q.
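A quick numerical check of this identity, again assuming NumPy and the same small discrete distributions used above:

```python
import numpy as np

p = np.array([0.5, 0.25, 0.25])
q = np.array([0.4, 0.4, 0.2])

entropy = -np.sum(p * np.log(p))            # H(P)
cross_entropy = -np.sum(p * np.log(q))      # H(P, Q)
kl_divergence = np.sum(p * np.log(p / q))   # KL(P || Q)

# H(P, Q) == H(P) + KL(P || Q), up to floating-point error.
print(cross_entropy, entropy + kl_divergence)
assert np.isclose(cross_entropy, entropy + kl_divergence)
```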