Depotformer
Depotformer is a class of neural network architectures that augments transformer models with an external, trainable memory component often referred to as a "depot." The depot stores a fixed or adaptive set of memory vectors that can be read from and written to by standard attention mechanisms, enabling models to retain and retrieve information across longer contexts or multiple tasks without extending the token sequence length.
Architectural variants typically interleave transformer layers with depot access operations: input tokens attend to depot vectors to read stored information, and the depot vectors are in turn updated from the token representations, so that memory contents evolve as the model processes input.
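The read and write pattern described above can be sketched with plain dot-product attention. The following NumPy example is illustrative only: the dimensions, the identity projections, and the gated write update are assumptions for brevity, not a specification of any particular Depotformer variant.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 16          # model dimension (illustrative)
n_tokens = 8    # sequence length
n_slots = 4     # number of depot memory vectors

tokens = rng.standard_normal((n_tokens, d))
depot = rng.standard_normal((n_slots, d))

# Read: tokens cross-attend over depot slots and add the result residually.
read_weights = softmax(tokens @ depot.T / np.sqrt(d))  # (n_tokens, n_slots)
tokens = tokens + read_weights @ depot                 # (n_tokens, d)

# Write: depot slots attend over tokens; a scalar gate (an assumed
# design choice here) blends new content into the existing memory.
write_weights = softmax(depot @ tokens.T / np.sqrt(d))  # (n_slots, n_tokens)
gate = 0.1
depot = (1 - gate) * depot + gate * (write_weights @ tokens)
```

Because the depot has a fixed number of slots, the cost of each access is independent of how much input the model has already seen, which is what allows information to persist without growing the token sequence.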
Depotformer-style designs are explored for applications that benefit from persistent or compacted knowledge, including long-context language modeling and multi-task settings.
Depotformer approaches relate to memory networks, retrieval-augmented models, and compressive or recurrent transformer variants. Research on these architectures is ongoing.