softmaxi
Softmaxi is a term used in theoretical discussions to denote a family of normalization functions that map a real-valued vector to a probability distribution, analogous to softmax but with adjustable sharpening. In the common form, for a vector z in R^K and a temperature parameter tau > 0, softmaxi_tau(z)_i = exp(z_i / tau) / sum_j exp(z_j / tau). When tau = 1 it coincides with the standard softmax; as tau decreases toward zero the output becomes more peaked and approaches a one-hot argmax; as tau increases it tends toward the uniform distribution.
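The single-temperature form above can be sketched in a few lines of NumPy. This is an illustrative implementation, not a reference one; the function name softmaxi is taken from this article, and the max-subtraction trick is a standard stability measure (the factor exp(-max/tau) cancels in the ratio, so the result is unchanged).

```python
import numpy as np

def softmaxi(z, tau=1.0):
    """Temperature-scaled softmax: exp(z_i/tau) / sum_j exp(z_j/tau).

    Subtracting the max before exponentiating avoids overflow in exp
    and does not change the result, since the shift cancels in the ratio.
    """
    z = np.asarray(z, dtype=float) / tau
    z = z - z.max()              # numerical stability
    e = np.exp(z)
    return e / e.sum()

z = [2.0, 1.0, 0.1]
print(softmaxi(z, tau=1.0))    # coincides with standard softmax
print(softmaxi(z, tau=0.1))    # sharply peaked, near a one-hot argmax
print(softmaxi(z, tau=10.0))   # close to the uniform distribution
```

The three calls at the bottom illustrate the limiting behavior described above: small tau concentrates nearly all mass on the largest component, large tau flattens the distribution toward uniform.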
An alternative variant introduces per-element temperatures t_i > 0, giving softmaxi(z)_i = exp(z_i / t_i) / sum_j exp(z_j / t_j). This generalizes the single-temperature form, letting each component be sharpened or flattened independently; setting every t_i to a common value tau recovers softmaxi_tau.
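A minimal sketch of the per-element variant, under the same assumptions as before. One subtlety worth encoding: subtracting a shared max from the raw logits is no longer harmless here (the correction factor exp(-m/t_i) would differ per coordinate), so the scaled values w_i = z_i / t_i are formed first and the stable softmax is applied to them.

```python
import numpy as np

def softmaxi_per_element(z, t):
    """Per-element-temperature variant: exp(z_i/t_i) / sum_j exp(z_j/t_j).

    Dividing each logit by its own temperature first reduces this to a
    standard softmax of w_i = z_i / t_i, where max-subtraction is again
    safe because the shared shift cancels in the ratio.
    """
    w = np.asarray(z, dtype=float) / np.asarray(t, dtype=float)
    w = w - w.max()              # stability; cancels after normalization
    e = np.exp(w)
    return e / e.sum()

# With all temperatures equal to 1, this matches the standard softmax.
print(softmaxi_per_element([2.0, 1.0, 0.1], [1.0, 1.0, 1.0]))
```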
Relationships and properties: softmaxi preserves differentiability and outputs a probability vector summing to 1. It shares softmax's invariance to adding a constant to every input (the shift cancels in the ratio), though the per-element-temperature variant loses this property, since the shift is scaled differently in each coordinate.
Applications and considerations: softmaxi can be used as a drop-in replacement in neural networks where tunable sharpness is useful, for example in attention weighting, knowledge distillation, or sampling from a model's output distribution. In practice the maximum logit is subtracted before exponentiation to avoid overflow; very small temperatures still produce near-one-hot outputs, which can make gradients vanish on the suppressed components.
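As a sketch of the drop-in use mentioned above, the following applies softmaxi to query-key scores in a toy attention-style pooling step. The helper attention_pool is hypothetical, introduced here only for illustration; lowering tau concentrates the pooling on the best-matching key.

```python
import numpy as np

def softmaxi(z, tau=1.0):
    """Temperature-scaled softmax with max-subtraction for stability."""
    z = np.asarray(z, dtype=float) / tau
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_pool(query, keys, values, tau=1.0):
    """Weight the rows of `values` by softmaxi of query-key dot products.

    Illustrative only: a single query against a small set of keys,
    with tau controlling how concentrated the pooling is.
    """
    scores = keys @ query          # one similarity score per key
    w = softmaxi(scores, tau=tau)  # tunable-sharpness weights
    return w @ values              # convex combination of value rows
```

With a very small tau the output approaches the value row of the best-matching key; with a very large tau it approaches the plain average of all value rows.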
See also: Softmax, temperature scaling, log-softmax, attention mechanism.