PageAug
PageAug is a data augmentation technique for document analysis and natural language processing. It artificially increases the size and diversity of a training dataset by generating modified versions of existing documents. The core idea is to simulate variations that commonly occur in real-world documents without altering their semantic content.
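As a rough illustration of generating modified versions of a document while leaving its content intact, the following Python sketch produces layout-only variants of a page. It is not PageAug's actual implementation: the page representation (a list of text lines), the two transformations (`jitter_spacing`, `shift_margin`), and the `augment` helper are all hypothetical names chosen for this example.

```python
import random
from typing import Callable, List

# Assumption for this sketch: a page is a list of text lines.
Page = List[str]
Transform = Callable[[Page, random.Random], Page]


def jitter_spacing(page: Page, rng: random.Random) -> Page:
    """Hypothetical transform: vary the gaps between words (1 to 3 spaces)."""
    out = []
    for line in page:
        jittered = ""
        for i, word in enumerate(line.split()):
            if i > 0:
                jittered += " " * rng.randint(1, 3)
            jittered += word
        out.append(jittered)
    return out


def shift_margin(page: Page, rng: random.Random) -> Page:
    """Hypothetical transform: indent every non-empty line by one random margin."""
    margin = " " * rng.randint(0, 4)
    return [margin + line if line.strip() else line for line in page]


def augment(page: Page, transforms: List[Transform],
            n_variants: int, seed: int = 0) -> List[Page]:
    """Apply the transforms in order to produce n_variants copies of the page."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    variants = []
    for _ in range(n_variants):
        variant = list(page)
        for transform in transforms:
            variant = transform(variant, rng)
        variants.append(variant)
    return variants


def tokens(page: Page) -> List[str]:
    """Flatten a page into its word sequence, ignoring all whitespace."""
    return [word for line in page for word in line.split()]


page = ["Invoice 42", "", "Total due:  100 USD"]
variants = augment(page, [jitter_spacing, shift_margin], n_variants=3, seed=1)
```

Each variant differs from the original only in whitespace, so `tokens(variant) == tokens(page)` holds for every variant. That equality serves here as a simple stand-in for "semantic content unchanged"; a real pipeline would need a check suited to its own transformations.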
The augmentation process typically involves applying a series of transformations to the original document pages. These
By exposing models to these augmented examples, PageAug aims to enhance their performance on tasks like document