Receipts1
Receipts1 is a dataset and accompanying software framework for research on receipt image understanding and information extraction. It provides a large collection of scanned receipts with structured annotations, along with tools that support optical character recognition (OCR), layout analysis, and data extraction in a reproducible way.
The dataset comprises thousands of receipt images drawn from diverse retailers and regions, including both real
Annotation is produced via a combination of automated labeling and human verification, with quality assurance metrics
Typical applications include training and benchmarking OCR systems, information extraction pipelines, receipt parsing, and cross-domain transfer
Access is provided through an open data repository and accompanying software library. Documentation covers data formats,