Recomputes
Recomputes, in the context of automatic differentiation and neural network training, refer to the practice of re-evaluating portions of the forward computation during the backward pass instead of storing all intermediate activations. This technique, often called activation checkpointing or rematerialization, reduces peak memory usage at the cost of extra forward computations.
How it works: The forward graph is divided into segments or checkpoints. During backpropagation, only a limited set of checkpointed activations is kept in memory; when gradients for a segment are needed, its forward pass is re-run from the nearest stored checkpoint to regenerate the discarded intermediates, which are freed again once that segment's backward step completes.
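A minimal sketch of this segment-style recomputation using PyTorch's torch.utils.checkpoint.checkpoint_sequential; the toy model, its dimensions, and the choice of two segments are illustrative assumptions, not a canonical configuration:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# Toy stack of 8 blocks (illustrative sizes, not a real architecture).
model = nn.Sequential(
    *[nn.Sequential(nn.Linear(256, 256), nn.ReLU()) for _ in range(8)]
)
x = torch.randn(32, 256, requires_grad=True)

# Split the stack into 2 segments: only the activations at segment
# boundaries are stored; everything inside a segment is recomputed
# from its boundary checkpoint during the backward pass.
y = checkpoint_sequential(model, 2, x, use_reentrant=False)
y.sum().backward()  # triggers per-segment forward recomputation
```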
Benefits and trade-offs: Recomputes enable training with larger models or larger batch sizes by lowering peak memory consumption, typically at the cost of roughly one extra forward pass through the network. A common strategy checkpoints every √n-th layer of an n-layer network, reducing stored-activation memory from O(n) to O(√n) while recomputing each discarded activation at most once.
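To make the trade-off concrete, a back-of-the-envelope sketch; activation_memory is a hypothetical helper, and the layer count and per-activation size are made-up numbers:

```python
import math

def activation_memory(n_layers, bytes_per_act, checkpoint_every=None):
    """Rough peak activation memory, assuming each layer produces
    one activation of bytes_per_act bytes."""
    if checkpoint_every is None:
        # Vanilla training stores every intermediate activation.
        return n_layers * bytes_per_act
    # Checkpointing stores one activation per segment boundary,
    # plus the live segment currently being recomputed.
    n_checkpoints = math.ceil(n_layers / checkpoint_every)
    return (n_checkpoints + checkpoint_every) * bytes_per_act

n, act = 64, 100 * 2**20            # 64 layers, 100 MiB per activation
print(activation_memory(n, act) / 2**30)                      # ~6.25 GiB
print(activation_memory(n, act, checkpoint_every=8) / 2**30)  # ~1.56 GiB
```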
Implementation and usage: Common in large-scale models such as transformers, where attention and MLP activations dominate memory. Frameworks like PyTorch offer activation checkpointing via torch.utils.checkpoint (checkpoint and checkpoint_sequential); JAX provides jax.checkpoint (also known as jax.remat), and TensorFlow offers tf.recompute_grad.
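A sketch of how a single block might wrap its forward pass in PyTorch's torch.utils.checkpoint.checkpoint; CheckpointedBlock and its layer sizes are invented for illustration, and use_reentrant=False assumes a recent PyTorch version:

```python
import torch
from torch.utils.checkpoint import checkpoint

class CheckpointedBlock(torch.nn.Module):
    """Hypothetical transformer-style block whose inner activations
    are recomputed during backward instead of being stored."""
    def __init__(self, dim=256):
        super().__init__()
        self.attn_proxy = torch.nn.Linear(dim, dim)  # stand-in for attention
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(dim, 4 * dim), torch.nn.GELU(),
            torch.nn.Linear(4 * dim, dim),
        )

    def _forward(self, x):
        x = x + self.attn_proxy(x)
        return x + self.mlp(x)

    def forward(self, x):
        # checkpoint() discards the intermediates of _forward and
        # re-runs it on demand during the backward pass.
        return checkpoint(self._forward, x, use_reentrant=False)

block = CheckpointedBlock()
out = block(torch.randn(4, 16, 256, requires_grad=True))
out.mean().backward()
```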
Limitations and considerations: Recomputation must be deterministic to preserve gradient correctness; stochastic layers such as dropout complicate checkpointing because the recomputed forward pass must replay the same random draws as the original pass, otherwise the backward pass differentiates through different activations than those actually produced in the forward pass.
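For stochastic layers, PyTorch's checkpoint snapshots and restores the RNG state by default (preserve_rng_state=True), so the recomputed forward pass samples the same dropout mask as the original; stochastic_segment below is a hypothetical example of such a segment:

```python
import torch
from torch.utils.checkpoint import checkpoint

dropout = torch.nn.Dropout(p=0.5)

def stochastic_segment(x):
    # Dropout makes this segment non-deterministic across runs.
    return dropout(x).sum(dim=-1)

x = torch.randn(8, 32, requires_grad=True)

# preserve_rng_state=True (the default) saves the RNG state at the
# original forward pass and restores it before recomputation, so the
# same dropout mask is drawn both times.
y = checkpoint(stochastic_segment, x, use_reentrant=False,
               preserve_rng_state=True)
y.sum().backward()
```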