Gumbel-softmax

The Gumbel-softmax, also known as the Concrete distribution in some contexts, is a differentiable relaxation of a categorical distribution that enables gradient-based optimization when working with discrete variables. It provides a continuous approximation to sampling from a finite set of categories, which is useful for neural networks trained with backpropagation.

Mechanism and formulation: Given a vector of unnormalized log-probabilities (logits) z = (z_1, ..., z_k) for a categorical variable, one samples independent Gumbel noise g_i ~ Gumbel(0, 1) for each category and computes y = softmax((z + g) / tau), i.e. y_i = exp((z_i + g_i) / tau) / sum_j exp((z_j + g_j) / tau), where tau > 0 is a temperature parameter. A Gumbel(0, 1) sample can be generated as g = -log(-log(u)) with u ~ Uniform(0, 1). The resulting vector y lies in the probability simplex and is differentiable with respect to z. As tau approaches zero, samples of y become increasingly peaked and approach one-hot vectors, approximating a discrete sample; for larger tau, the output is smoother and more diffuse.
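
A minimal NumPy sketch of this sampling step, assuming a NumPy environment; the function name and example logits are illustrative, not from any particular library:

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, rng=None):
    """Draw one Gumbel-softmax sample y on the simplex from unnormalized logits z."""
    if rng is None:
        rng = np.random.default_rng()
    # Gumbel(0, 1) noise via the inverse-CDF trick: g = -log(-log(u)), u ~ Uniform(0, 1).
    u = rng.uniform(low=np.finfo(float).tiny, high=1.0, size=np.shape(logits))
    g = -np.log(-np.log(u))
    # Temperature-scaled softmax over the perturbed logits.
    x = (np.asarray(logits) + g) / tau
    x = x - x.max()            # shifting by the max does not change the softmax; avoids overflow
    y = np.exp(x)
    return y / y.sum()

logits = np.log([0.2, 0.5, 0.3])                 # logits of an example 3-way categorical
print(gumbel_softmax_sample(logits, tau=0.5))    # a point on the simplex, near one-hot for small tau
```

Because softmax with tau > 0 preserves the argmax of z + g, taking the argmax of y recovers an exact categorical sample (the Gumbel-Max trick discussed below).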

Relation to sampling tricks: The approach is closely related to the Gumbel-Max trick, which takes the argmax of z_i + g_i to obtain a discrete category. The Gumbel-softmax replaces the non-differentiable argmax with a differentiable softmax, allowing gradients to flow during training. A common variant is the straight-through estimator, which uses a discrete one-hot sample in the forward pass but preserves the differentiable relaxed path in the backward pass.
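
A short sketch of the straight-through variant, assuming PyTorch so that the backward pass is meaningful; the helper name is hypothetical (recent PyTorch versions also expose torch.nn.functional.gumbel_softmax with a hard flag for the same purpose):

```python
import torch
import torch.nn.functional as F

def straight_through_gumbel_softmax(logits, tau=1.0):
    """One-hot sample in the forward pass, soft-relaxation gradients in the backward pass."""
    # Gumbel(0, 1) noise: if E ~ Exponential(1), then -log(E) ~ Gumbel(0, 1).
    gumbels = -torch.empty_like(logits).exponential_().log()
    y_soft = F.softmax((logits + gumbels) / tau, dim=-1)      # relaxed, differentiable sample
    # Discrete one-hot sample chosen by argmax (non-differentiable on its own).
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
    # Straight-through trick: forward value is y_hard, gradient flows through y_soft.
    return y_hard - y_soft.detach() + y_soft

logits = torch.tensor([[0.2, 0.5, 0.3]]).log().requires_grad_()
sample = straight_through_gumbel_softmax(logits, tau=0.5)     # exactly one-hot, yet backpropagable
```

In the forward pass the downstream network sees an exact one-hot vector, while gradients with respect to the logits are those of the relaxed sample, trading discreteness for some gradient bias.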

Applications and limitations: It is used for training models with discrete latent variables, variational autoencoders with categorical latents, differentiable neural architecture search, and reinforcement learning with discrete actions. Limitations include bias from the relaxation when tau is not very small, sensitivity to the temperature schedule, and increased gradient variance as tau is lowered. Related concepts include the Gumbel distribution, the reparameterization trick, and the Concrete distribution.
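
Because a small tau reduces relaxation bias but tends to increase gradient variance, a common practice is to anneal the temperature from a moderate value toward a small floor over training. A rough sketch of such a schedule, with purely illustrative constants:

```python
import math

def annealed_tau(step, tau_start=1.0, tau_min=0.1, decay_rate=1e-4):
    """Exponentially decay the Gumbel-softmax temperature toward a minimum value."""
    return max(tau_min, tau_start * math.exp(-decay_rate * step))

for step in (0, 10_000, 50_000):
    print(step, round(annealed_tau(step), 3))   # 1.0, then ~0.368, then clipped at 0.1
```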
