Insertion-based

Insertion-based is a term used in natural language processing to describe a family of sequence generation methods in which the output is constructed by iteratively inserting tokens into a partially built sequence, rather than predicting tokens strictly from left to right. The approach enables non-sequential generation and can leverage parallel insertions, potentially speeding up decoding while maintaining competitive output quality.

The best-known example is the Insertion Transformer, introduced as an alternative to left-to-right autoregressive decoding. Insertion-based models generate text by inserting tokens at chosen positions in the current sequence rather than extending it one token at a time from the end.
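
For intuition, consider the sentence "the cat sat on the mat" built in the balanced-tree order used by the Insertion Transformer, where every slot can receive an insertion in parallel at each step (a hypothetical trace, not output from a real model):

step 0: (empty)
step 1: sat
step 2: the sat the
step 3: the cat sat on the mat

Left-to-right decoding would need six steps; parallel insertion completes the sentence in three, which is where the potential speedup comes from.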
How it works: the generation process starts from a seed sequence, often just a start token. The model proposes insertions: a set of positions between existing tokens, the tokens to be inserted there, and their probabilities. In a single decoding step, multiple tokens can be inserted, and after several steps the final sequence emerges.
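
In code, the decoding loop might look like the following minimal Python sketch. All names here are hypothetical: propose_insertions stands in for a trained model that returns one (token, probability) proposal per slot, where slot i is the gap before position i of the canvas and a special END marker means "insert nothing here".

END = "<none>"  # hypothetical marker: the model predicts "insert nothing" for a slot

def decode(propose_insertions, max_steps=32, threshold=0.5):
    canvas = []  # seed sequence; could also start from a start token
    for _ in range(max_steps):
        # One (token, probability) proposal per slot; there are len(canvas) + 1 slots.
        proposals = propose_insertions(canvas)
        chosen = [(slot, tok) for slot, (tok, p) in enumerate(proposals)
                  if tok != END and p >= threshold]
        if not chosen:
            break  # every slot predicts "insert nothing": the sequence is final
        # Apply all chosen insertions in one step, right to left so that
        # earlier slot indices are not shifted by later insertions.
        for slot, tok in sorted(chosen, reverse=True):
            canvas.insert(slot, tok)
    return canvas

The threshold rule is one simple way to trade parallelism against accuracy: a high threshold inserts only confident tokens at each step, while a low one approaches fully parallel decoding.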
Training typically uses teacher forcing to align predicted insertions with the target sequence, with the model learning an insertion plan or order. Variants explore different planners or hierarchical schemes to organize insertions.
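
As a sketch of what that alignment looks like, one common recipe (an assumption here, loosely following the Insertion Transformer's training setup) samples a subsequence of the target as the canvas and supervises each slot with the target tokens missing from that gap:

import random

def make_training_example(target):
    # Keep a random sorted subset of target positions as the current canvas.
    k = random.randint(0, len(target))
    kept = sorted(random.sample(range(len(target)), k))
    canvas = [target[i] for i in kept]
    # Gold insertions: slot j is the gap before canvas[j] (the last slot is
    # the gap at the end); its targets are the tokens missing in that gap.
    # An empty list means the model should predict "insert nothing" there.
    bounds = [-1] + kept + [len(target)]
    gold = {j: [target[i] for i in range(bounds[j] + 1, bounds[j + 1])]
            for j in range(len(canvas) + 1)}
    return canvas, gold

# Example: with target "the cat sat on the mat".split() and kept == [1, 4],
# canvas is ["cat", "the"] and gold is {0: ["the"], 1: ["sat", "on"], 2: ["mat"]}.

The insertion plan or order then decides which of a slot's missing tokens the loss rewards first, for example uniformly over all of them, or weighted toward the middle token of the gap to encourage balanced-tree decoding.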
Advantages and limitations: insertion-based methods can offer faster decoding due to parallel insertions and can better capture non-local dependencies or alternative token orders. They also allow flexible output lengths. However, they involve more complex training and inference machinery, require reliable insertion plans, and on some tasks may underperform left-to-right baselines without careful design and tuning.

Applications and scope: these models have been explored for language generation, machine translation, summarization, and code generation, among other sequence-generation tasks. Related concepts include non-autoregressive generation and traditional left-to-right autoregressive models.