Encoder–decoder transformer
An encoder–decoder transformer, also called a seq2seq transformer, is a neural network architecture designed for transforming one sequence into another. It uses self-attention to model dependencies within the input and output sequences and cross-attention to relate the two, enabling powerful sequence transduction without recurrent networks.
The architecture consists of two main components. The encoder is a stack of layers, each with a multi-head self-attention sublayer and a position-wise feed-forward sublayer, that maps the input sequence to a sequence of contextual representations. The decoder is a similar stack whose layers add a cross-attention sublayer that attends to the encoder's output; the decoder's self-attention is masked so that each position can attend only to earlier positions, preserving the autoregressive property during generation.
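The distinction between self-attention and cross-attention described above can be illustrated with a minimal NumPy sketch of single-head scaled dot-product attention. This is not a full implementation (no learned projections, no multiple heads, no masking); the array names and sizes are illustrative only.

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
src = rng.normal(size=(5, 8))  # encoder states: 5 source positions, width 8
tgt = rng.normal(size=(3, 8))  # decoder states: 3 target positions, width 8

# Self-attention: queries, keys, and values all come from the same sequence.
enc_out = attention(src, src, src)

# Cross-attention: queries come from the decoder, while keys and values
# come from the encoder's output, relating the two sequences.
dec_out = attention(tgt, enc_out, enc_out)

print(enc_out.shape, dec_out.shape)  # one output vector per query position
```

Note that the output always has one row per *query*: cross-attention produces three vectors (one per target position), each a weighted mixture of the five encoder states.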
Training typically uses supervised learning to maximize the likelihood of the target sequence given the input, usually by minimizing token-level cross-entropy with teacher forcing: at each step the decoder is conditioned on the ground-truth previous tokens rather than on its own predictions.
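The training objective above can be sketched as a per-position negative log-likelihood over the decoder's output distributions. This is a toy NumPy illustration under teacher forcing, where the logits stand in for the decoder's outputs given the ground-truth prefix; the sizes and names are assumptions for the example.

```python
import numpy as np

def sequence_nll(logits, targets):
    # Mean negative log-likelihood of the target tokens under the
    # per-position next-token distributions (log-softmax over the vocab).
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# Toy example: 4 target positions over a 6-token vocabulary.
rng = np.random.default_rng(1)
logits = rng.normal(size=(4, 6))   # decoder outputs, one row per position
targets = np.array([2, 0, 5, 1])   # ground-truth next tokens

loss = sequence_nll(logits, targets)
print(loss > 0)  # cross-entropy is non-negative
```

A sanity check on the formula: with uniform logits the loss reduces to log(vocabulary size), here log 6.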
Variants and applications: The encoder–decoder transformer underpins many tasks, notably machine translation and text summarization, as well as other sequence transduction problems in which an input sequence must be mapped to an output sequence of different length or modality.
History and impact: Introduced by Vaswani et al. in 2017 in the paper "Attention Is All You Need", the architecture popularized attention-based sequence modeling and served as the basis for later encoder-only and decoder-only variants such as BERT and GPT.