SELU
SELU, short for Scaled Exponential Linear Unit, is an activation function designed to promote self-normalizing behavior in deep neural networks. It was introduced in 2017 by Klambauer, Unterthiner, Mayr, and Hochreiter in the paper Self-Normalizing Neural Networks. The goal is to keep activations at near-zero mean and unit variance across layers, reducing or eliminating the need for explicit normalization layers in some architectures.
Mathematically, SELU is defined as f(x) = lambda * x for x > 0, and f(x) = lambda * alpha * (exp(x) - 1) for x <= 0, where alpha ≈ 1.6733 and lambda ≈ 1.0507 are fixed constants derived in the paper so that the mapping preserves zero mean and unit variance under the stated assumptions.
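As a concrete illustration of this definition, here is a minimal NumPy sketch; the function name selu and the rounded constants are just for this example.

```python
import numpy as np

# Constants from Klambauer et al. (2017), truncated here for readability.
ALPHA = 1.6732632423543772
LAMBDA = 1.0507009873554805

def selu(x):
    """SELU: lambda * x for x > 0, lambda * alpha * (exp(x) - 1) for x <= 0."""
    x = np.asarray(x, dtype=np.float64)
    # expm1 computes exp(x) - 1 accurately; clamping to <= 0 avoids overflow
    # warnings in the branch that is discarded for positive inputs.
    return LAMBDA * np.where(x > 0, x, ALPHA * np.expm1(np.minimum(x, 0.0)))

print(selu([-2.0, -0.5, 0.0, 0.5, 2.0]))
```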
SELU is intended to enable self-normalizing neural networks, which can reduce reliance on batch normalization, particularly in fully connected feedforward architectures, the setting analyzed in the original paper.
Practical considerations include ensuring inputs are standardized (zero mean, unit variance per feature), using the recommended LeCun normal initialization, and pairing SELU with alpha dropout rather than standard dropout. While SELU can improve training stability in deep fully connected networks without normalization layers, it has seen less adoption than ReLU-family activations in convolutional and transformer architectures.
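As one way to put these points together, the sketch below (assuming PyTorch, which the text does not specify) builds a small fully connected stack with nn.SELU, hand-rolled LeCun normal initialization, and nn.AlphaDropout, with no normalization layers; the layer widths and dropout rate are arbitrary placeholders.

```python
import torch
import torch.nn as nn

def lecun_normal_(linear: nn.Linear) -> None:
    # LeCun normal: weights ~ N(0, 1 / fan_in), biases set to zero.
    fan_in = linear.in_features
    nn.init.normal_(linear.weight, mean=0.0, std=(1.0 / fan_in) ** 0.5)
    nn.init.zeros_(linear.bias)

# A small fully connected stack using SELU and no normalization layers.
sizes = [20, 64, 64, 1]  # example widths, not a recommendation
layers = []
for i in range(len(sizes) - 1):
    linear = nn.Linear(sizes[i], sizes[i + 1])
    lecun_normal_(linear)
    layers.append(linear)
    if i < len(sizes) - 2:  # no activation/dropout after the output layer
        layers.append(nn.SELU())
        layers.append(nn.AlphaDropout(p=0.05))  # dropout variant designed for SELU
model = nn.Sequential(*layers)

# Inputs should be standardized; standard normal noise is used here purely
# for illustration.
x = torch.randn(32, 20)
print(model(x).shape)  # torch.Size([32, 1])
```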