Passivformer
Passivformer is a term used in neural network design for Transformer-based models optimized for passive or energy-efficient computation. The goal is to reduce energy use, latency, and hardware demands during inference on resource-constrained devices while preserving acceptable accuracy. Variants rely on principles such as fixed or slowly changing components, low-precision arithmetic, and sparse or event-driven processing; the low-precision principle is sketched in code below.
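To make the low-precision principle concrete, the following is a minimal sketch using PyTorch's post-training dynamic quantization. The toy feed-forward block is a hypothetical stand-in for a Transformer sublayer, not a reference Passivformer implementation.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

# Hypothetical stand-in for a Transformer feed-forward sublayer.
model = nn.Sequential(
    nn.Linear(512, 2048),
    nn.ReLU(),
    nn.Linear(2048, 512),
).eval()

# Post-training dynamic quantization: weights are stored as int8 and
# activations are quantized on the fly, cutting memory and inference cost.
quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
with torch.no_grad():
    y = quantized(x)  # runs int8 weight matmuls at inference time
```

Dynamic quantization is the least invasive option here because it requires no calibration data or retraining, which suits the post-hoc, inference-only setting the term describes.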
Design approaches grouped under Passivformer include sparse attention patterns, kernelized attention to reduce compute, quantization to low-precision arithmetic, and event-driven processing.
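As a sketch of the kernelized-attention approach, the following implements the widely used linear-attention formulation with an ELU-plus-one feature map, replacing softmax(QK^T)V with phi(Q)(phi(K)^T V) so that cost grows linearly in sequence length. The function name and shapes are illustrative assumptions, not a published Passivformer design.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Kernelized (linear) attention sketch.

    q, k, v: (batch, seq_len, dim). Uses phi(x) = elu(x) + 1 as the
    feature map, avoiding the O(n^2) attention matrix entirely.
    """
    phi_q = F.elu(q) + 1.0  # positive feature map of queries
    phi_k = F.elu(k) + 1.0  # positive feature map of keys
    # Aggregate keys and values once: (batch, dim, dim) instead of (n, n).
    kv = torch.einsum("bnd,bne->bde", phi_k, v)
    # Per-query normalizer, the kernelized analogue of the softmax denominator.
    norm = torch.einsum("bnd,bd->bn", phi_q, phi_k.sum(dim=1)) + eps
    return torch.einsum("bnd,bde->bne", phi_q, kv) / norm.unsqueeze(-1)

q = k = v = torch.randn(2, 1024, 64)
out = linear_attention(q, k, v)  # (2, 1024, 64); no 1024 x 1024 matrix is formed
```

Because the key-value summary `kv` has a fixed size independent of sequence length, this style of attention can also be evaluated incrementally, which is what makes it attractive for the low-latency, resource-constrained inference settings described above.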
Applications focus on mobile AI, edge devices, and real-time systems where energy and latency are critical.