SNLI

SNLI, the Stanford Natural Language Inference Corpus, is a large-scale dataset created to evaluate natural language inference (NLI). It contains hundreds of thousands of sentence pairs, each consisting of a premise and a hypothesis, labeled with one of three relations: entailment, contradiction, or neutral. The premises are drawn from image captions in the Flickr30k dataset, and crowdworkers wrote hypotheses that are entailed by, contradict, or are neutral with respect to each premise. The dataset was constructed to support training and evaluating models that determine logical relationships between sentences.
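
To make the pair-and-label structure concrete, here is a minimal sketch of loading SNLI with the Hugging Face datasets library; the dataset id "snli" and the integer label mapping (0 = entailment, 1 = neutral, 2 = contradiction, -1 = no gold label) follow that library's conventions rather than anything stated above.

```python
from datasets import load_dataset

# Fetch SNLI from the Hugging Face hub (dataset id "snli").
snli = load_dataset("snli")

# Each example is a premise/hypothesis pair plus an integer label:
# 0 = entailment, 1 = neutral, 2 = contradiction; -1 marks examples
# without a gold label, which are usually filtered out before training.
example = snli["train"][0]
print(example["premise"])
print(example["hypothesis"])
print(example["label"])
```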

The SNLI corpus was introduced by Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning in 2015. It was built using Amazon Mechanical Turk annotations to obtain high-quality linguistic inferences, with a focus on creating a large, diverse, and balanced set of labels. The standard splits include a training set, a development (dev) set, and a test set, with the training portion comprising roughly half a million examples and the dev and test sets containing several thousand examples each.
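
As a rough check of those split sizes, one can count the examples per split; this sketch again assumes the Hugging Face "snli" dataset, in which the dev set is exposed under the name "validation".

```python
from datasets import load_dataset

snli = load_dataset("snli")

# Report the size of each standard split. In the Hugging Face version,
# the development (dev) set is named "validation".
for split in ("train", "validation", "test"):
    print(f"{split}: {len(snli[split])} examples")
```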

SNLI has become a foundational benchmark in natural language processing, widely used to train and evaluate neural architectures such as recurrent and attention-based models and, later, transformer-based approaches. It has influenced the development of subsequent NLI datasets and tasks, while also revealing limitations and biases of crowd-sourced annotation, such as annotation artifacts that let models predict labels from the hypothesis alone. Researchers routinely report accuracy on the SNLI dev and test sets to compare model performance across approaches.
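
As an illustration of how such accuracy numbers are computed, here is a sketch that scores a trivial baseline on the test split; `predict` is a hypothetical placeholder for a real model, and examples without a gold label (label -1) are skipped, as is conventional.

```python
from datasets import load_dataset

snli_test = load_dataset("snli", split="test")

def predict(premise: str, hypothesis: str) -> int:
    """Hypothetical stand-in for a real model: always guess entailment (0)."""
    return 0

# Standard SNLI evaluation: accuracy over examples with a gold label.
correct = total = 0
for ex in snli_test:
    if ex["label"] == -1:  # no gold label; skipped by convention
        continue
    total += 1
    correct += int(predict(ex["premise"], ex["hypothesis"]) == ex["label"])

print(f"test accuracy: {correct / total:.3f}")
```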