Regularizer
A regularizer, in statistics and machine learning, is a component added to a loss function to constrain model complexity and improve generalization. Formally, with a loss L(θ) on training data and a parameter vector θ, a regularized objective takes the form L(θ) + λR(θ), where R is a penalty function and λ ≥ 0 is the regularization strength. By increasing λ, the model is discouraged from adopting large or complex parameter values.
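The objective L(θ) + λR(θ) can be written out directly. Below is a minimal sketch for a least-squares loss with an L2 penalty; the function name `regularized_loss` and the choice of mean squared error as the data loss are illustrative, not from the original text.

```python
import numpy as np

def regularized_loss(theta, X, y, lam):
    # Data loss L(theta): mean squared error of the linear predictions.
    residual = X @ theta - y
    data_loss = float(residual @ residual) / len(y)
    # Penalty R(theta) = ||theta||_2^2, weighted by the strength lam >= 0.
    penalty = float(theta @ theta)
    return data_loss + lam * penalty
```

Increasing `lam` raises the cost of large parameter values, so the minimizer is pulled toward smaller weights.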
Common regularizers include L2 (R(θ) = ||θ||_2^2), also called ridge or Tikhonov regularization, which shrinks weights toward zero but rarely makes them exactly zero; L1 (R(θ) = ||θ||_1), also called lasso, which promotes sparsity by driving some weights exactly to zero; and the elastic net, which combines the two penalties.
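For the ridge penalty with a least-squares loss, the minimizer has the well-known closed form (XᵀX + λI)⁻¹Xᵀy, which makes the shrinkage effect easy to see. A minimal sketch (the helper name `ridge_fit` is illustrative):

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form ridge regression: solve (X^T X + lam * I) theta = X^T y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```

At λ = 0 this reduces to ordinary least squares; as λ grows, the solution's norm shrinks toward zero.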
From a Bayesian perspective, regularization corresponds to placing priors on the parameters and taking the maximum a posteriori (MAP) estimate. For example, a Gaussian prior on θ yields the L2 penalty, while a Laplace prior yields the L1 penalty; in each case λ is determined by the ratio of noise variance to prior scale.
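The Gaussian-prior correspondence can be checked numerically: for a linear model with Gaussian noise of variance s2_n and a Gaussian prior of variance s2_p, the MAP estimate is the ridge solution with λ = s2_n / s2_p. The data values and variable names below are illustrative, assumed for this sketch.

```python
import numpy as np

# Assumed model for illustration: y = x * theta + noise, noise ~ N(0, s2_n),
# prior theta ~ N(0, s2_p). The negative log posterior (up to constants) is
#   ||y - x*theta||^2 / (2*s2_n) + theta^2 / (2*s2_p),
# i.e. a ridge objective with lam = s2_n / s2_p.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.1, 5.9])
s2_n, s2_p = 0.5, 2.0
lam = s2_n / s2_p

# MAP estimate by fine grid search over the negative log posterior.
grid = np.linspace(-5.0, 5.0, 200001)
neg_log_post = (((y[:, None] - x[:, None] * grid) ** 2).sum(axis=0) / (2 * s2_n)
                + grid ** 2 / (2 * s2_p))
theta_map = grid[np.argmin(neg_log_post)]

# Closed-form scalar ridge solution with the matching lam.
theta_ridge = (x @ y) / (x @ x + lam)
```

The two estimates agree to within the grid resolution, illustrating that the Gaussian prior and the L2 penalty encode the same preference for small weights.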
Optimization and computation vary by regularizer. L1 is convex but non-differentiable at zero, often handled by proximal methods such as soft-thresholding, by subgradient methods, or by coordinate descent. L2 is smooth, and its gradient contribution 2λθ appears in gradient descent as weight decay.
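The proximal operator of the L1 penalty is the soft-thresholding function sign(v)·max(|v| − λ, 0), which is the building block of proximal gradient methods such as ISTA. A minimal sketch (the name `soft_threshold` is illustrative):

```python
import numpy as np

def soft_threshold(v, lam):
    # Proximal operator of lam * ||.||_1: shrink each entry toward zero,
    # and set it exactly to zero when |v_i| <= lam (the source of sparsity).
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)
```

Entries whose magnitude falls below the threshold are zeroed out, which is exactly how L1 regularization produces sparse solutions.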