CosAT
cosAT is a term used in computer science for a class of algorithms and neural architectures that integrate cosine similarity with attention-based models, replacing or augmenting standard scaled dot-product attention with cosine-based similarity measures.
In cosAT, attention weights are computed from the cosine similarity between normalized query and key vectors, typically scaled by a temperature parameter before the softmax, rather than from the raw dot product.
cosAT can appear in self-attention, cross-attention, or multi-head configurations. Implementations may normalize queries and keys to unit length before computing similarities, then apply a learned or fixed scale to control the sharpness of the attention distribution.
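The mechanism described above can be sketched as follows. This is a minimal single-head illustration, not a reference implementation; the function name `cosine_attention` and the temperature parameter `tau` are illustrative assumptions.

```python
import numpy as np

def cosine_attention(q, k, v, tau=0.1):
    """Single-head attention using cosine similarity (illustrative sketch).

    q: (n_q, d) queries; k: (n_k, d) keys; v: (n_k, d_v) values.
    tau is a hypothetical temperature; smaller values sharpen the softmax.
    """
    eps = 1e-8  # guard against zero-norm vectors
    qn = q / (np.linalg.norm(q, axis=-1, keepdims=True) + eps)
    kn = k / (np.linalg.norm(k, axis=-1, keepdims=True) + eps)
    scores = qn @ kn.T / tau  # cosine similarities in [-1, 1], then scaled
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v
```

Because cosine similarities are bounded in [-1, 1], the scale `1/tau` plays the role that `1/sqrt(d)` plays in dot-product attention: without it, the softmax over the bounded scores would be nearly uniform.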
Reported motivations for cosAT include improved robustness to varying feature scales, better generalization across domains, and, in some accounts, more stable training, since the bounded similarity values limit the magnitude of attention logits.
Challenges include modest computational overhead from the extra normalization and the need for careful scaling, since bounded cosine scores can otherwise yield overly flat attention distributions. While results vary across tasks, cosine-based attention remains an active area of study.
See also: attention mechanism, cosine similarity, Transformer, neural network, machine learning.