SNLI

SNLI, the Stanford Natural Language Inference Corpus, is a large-scale dataset created to evaluate natural language inference (NLI). It contains hundreds of thousands of sentence pairs, each consisting of a premise and a hypothesis, labeled with one of three relations: entailment, contradiction, or neutral. The premises are drawn from image captions in the Flickr30k dataset, and crowdworkers wrote hypotheses that are entailed by, contradict, or are neutral with respect to each premise. The dataset was constructed to support training and evaluating models that determine logical relationships between sentences.
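
To make the pair-and-label structure concrete, here is a minimal sketch of loading SNLI with the Hugging Face datasets library; the dataset id "snli" and the integer label mapping (0 = entailment, 1 = neutral, 2 = contradiction, -1 = no gold label) follow that library's conventions rather than anything stated above.

```python
from datasets import load_dataset

# Fetch SNLI from the Hugging Face hub (dataset id "snli").
snli = load_dataset("snli")

# Each example is a premise/hypothesis pair plus an integer label:
# 0 = entailment, 1 = neutral, 2 = contradiction; -1 marks examples
# without a gold label, which are usually filtered out before training.
example = snli["train"][0]
print(example["premise"])
print(example["hypothesis"])
print(example["label"])
```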

The SNLI corpus was introduced by Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning in 2015. It was built using Amazon Mechanical Turk annotations to obtain high-quality linguistic inferences, with a focus on creating a large, diverse, and balanced set of labels. The standard splits include a training set, a development (dev) set, and a test set, with the training portion comprising roughly half a million examples and the dev and test sets containing several thousand examples each.
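
As a rough check of those split sizes, one can count the examples per split; this sketch again assumes the Hugging Face "snli" dataset, in which the dev set is exposed under the name "validation".

```python
from datasets import load_dataset

snli = load_dataset("snli")

# Report the size of each standard split. In the Hugging Face version,
# the development (dev) set is named "validation".
for split in ("train", "validation", "test"):
    print(f"{split}: {len(snli[split])} examples")
```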

SNLI has become a foundational benchmark in natural language processing, widely used to train and evaluate neural architectures such as recurrent and attention-based models and, later, transformer-based approaches. It has influenced the development of subsequent NLI datasets and tasks, while also revealing limitations and biases of crowd-sourced annotation, such as annotation artifacts that let models predict labels from the hypothesis alone. Researchers routinely report accuracy on the SNLI dev and test sets to compare model performance across approaches.
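
As an illustration of how such accuracy numbers are computed, here is a sketch that scores a trivial baseline on the test split; `predict` is a hypothetical placeholder for a real model, and examples without a gold label (label -1) are skipped, as is conventional.

```python
from datasets import load_dataset

snli_test = load_dataset("snli", split="test")

def predict(premise: str, hypothesis: str) -> int:
    """Hypothetical stand-in for a real model: always guess entailment (0)."""
    return 0

# Standard SNLI evaluation: accuracy over examples with a gold label.
correct = total = 0
for ex in snli_test:
    if ex["label"] == -1:  # no gold label; skipped by convention
        continue
    total += 1
    correct += int(predict(ex["premise"], ex["hypothesis"]) == ex["label"])

print(f"test accuracy: {correct / total:.3f}")
```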