
LRS2 and LRS3

LRS2 and LRS3, collectively referred to as the Lip Reading Sentences series, are large-scale public datasets created to support research in visual speech recognition and lip reading. They provide video data of people speaking along with aligned textual transcriptions, enabling the development and evaluation of models that interpret speech from visual input.

LRS2 (Lip Reading Sentences 2) is built from publicly available video sources such as broadcast programs. The dataset comprises numerous clips, each containing a speaker’s face and mouth region accompanied by a corresponding transcription. The data are organized with time-aligned transcripts and commonly include metadata about speakers and video conditions. LRS2 is widely used as a benchmark for training end-to-end models that map visual mouth movements to text and for comparing lip-reading approaches.
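
A minimal sketch of how such paired data might be traversed, assuming a hypothetical layout in which each clip is an .mp4 file stored next to a same-named .txt transcript; the actual releases define their own directory splits and transcript format:

    # Hypothetical layout: each clip is an .mp4 file with a same-named .txt
    # transcript beside it. The real LRS2/LRS3 releases define their own splits
    # and transcript format, so treat this purely as an illustrative sketch.
    from pathlib import Path

    def iter_clips(root):
        """Yield (video_path, transcript_text) pairs found under root."""
        for video in sorted(Path(root).rglob("*.mp4")):
            transcript = video.with_suffix(".txt")
            if not transcript.exists():
                continue  # skip clips without an aligned transcript
            yield video, transcript.read_text(encoding="utf-8").strip()

    if __name__ == "__main__":
        # "data/lrs2" is a placeholder path used only for illustration.
        for video, text in iter_clips("data/lrs2"):
            print(video.name, "->", text[:60])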

LRS3 (Lip Reading Sentences 3) extends the scope of LRS2 by increasing data diversity and scale. It incorporates a broader range of speakers, speaking styles, and recording conditions, often drawing from additional sources such as online videos and talks. This expansion supports learning more robust and generalizable visual speech representations, and the dataset typically offers longer utterances and more varied linguistic content than its predecessor.

Both datasets are used to train and evaluate audiovisual or visual-only speech recognition systems. Evaluation commonly relies on word error rate (WER) or character error rate (CER) to measure how accurately a model reconstructs spoken content from visual input.
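
WER is typically computed as the word-level Levenshtein (edit) distance between the reference transcript and the model output, divided by the number of reference words; CER applies the same computation at the character level. A minimal self-contained sketch:

    # Word error rate: word-level edit distance between reference and hypothesis,
    # normalized by the number of reference words. CER is the same computation
    # applied to character sequences instead of word sequences.
    def wer(reference: str, hypothesis: str) -> float:
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = edit distance between ref[:i] and hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
                dp[i][j] = min(substitution, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
        return dp[len(ref)][len(hyp)] / max(len(ref), 1)

    # One deleted word out of six reference words -> WER of about 0.167.
    print(wer("the cat sat on the mat", "the cat sat on mat"))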

Access to these datasets is provided under research-use licenses and is intended to promote reproducibility and progress in the field.

See also related lip-reading datasets and benchmarks in visual speech recognition literature.
