PMformer
PMformer is a family of transformer-based neural networks designed to improve efficiency and scalability when processing long sequences. The architecture emphasizes memory components and permutation-based attention mechanisms to reduce computational overhead while maintaining or improving performance on sequence modeling tasks.
Core ideas include a permutation-based self-attention scheme that rearranges token interactions to approximate full attention at reduced computational cost.
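The source does not specify the exact attention mechanism, so the following is a minimal illustrative sketch, assuming one common realization of the idea: tokens are randomly permuted, attention is computed only within fixed-size blocks of the permuted sequence, and the permutation is then inverted. The function name permuted_block_attention, the block count, and the use of the inputs directly as queries, keys, and values are all hypothetical choices, not PMformer's documented design.

```python
import torch

def permuted_block_attention(x, num_blocks, perm=None):
    """Hypothetical sketch of permutation-based block attention.

    x: tensor of shape (batch, seq_len, dim); seq_len must be
    divisible by num_blocks. Inputs serve directly as Q, K, and V
    (no learned projections, for brevity).
    """
    b, n, d = x.shape
    if perm is None:
        perm = torch.randperm(n)       # random token permutation
    inv = torch.argsort(perm)          # inverse permutation to restore order

    xp = x[:, perm, :]                 # rearrange tokens
    blocks = xp.view(b, num_blocks, n // num_blocks, d)

    # Full self-attention within each block only, so tokens interact
    # across a shuffled neighborhood rather than the whole sequence.
    scores = blocks @ blocks.transpose(-1, -2) / d ** 0.5
    attn = torch.softmax(scores, dim=-1)
    out = (attn @ blocks).view(b, n, d)

    return out[:, inv, :]              # undo the permutation

# Usage example
x = torch.randn(2, 16, 8)
y = permuted_block_attention(x, num_blocks=4)
```

Under these assumptions, with block size s = n / num_blocks, each block's attention costs O(s^2), giving O(n * s) total rather than the O(n^2) of full attention, which is consistent with the efficiency goal described above.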
PMformer has been applied to natural language processing tasks that require long-range context, such as long-form document modeling.
Limitations include approximation error introduced by the permutation schemes, sensitivity to architectural design choices, and additional memory requirements for maintaining memory components over long sequences.
See also: Transformer, Longformer, Linformer, Reformer, Performer, Perceiver.