preannotation
Preannotation is the practice of generating initial annotations for a dataset before the primary human labeling pass. It combines automated labeling, heuristic rules, and domain knowledge to produce a first draft of labels that annotators review and correct. It is commonly used in natural language processing, computer vision, and biomedical data labeling, where large volumes of data or complex schemas make fully manual annotation time-consuming.
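As an illustration of the draft-then-review workflow described above, the following is a minimal sketch of a dictionary-based preannotator for text. The lexicon, the `Span` record, and the `preannotate` and `review` helpers are all hypothetical names chosen for this example; real projects would typically plug in their own label schema and annotation tool format.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Span:
    start: int              # character offset where the match begins
    end: int                # character offset just past the match
    label: str              # draft label proposed by the preannotator
    reviewed: bool = False  # stays False until a human checks the span


# Hypothetical lexicon mapping surface terms to labels (an assumption for this sketch).
LEXICON = {
    "aspirin": "DRUG",
    "ibuprofen": "DRUG",
    "headache": "SYMPTOM",
}


def preannotate(text, lexicon=LEXICON):
    """Return draft spans for every whole-word, case-insensitive lexicon match."""
    spans = []
    for term, label in lexicon.items():
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            spans.append(Span(match.start(), match.end(), label))
    # Sort by position so annotators see drafts in reading order.
    return sorted(spans, key=lambda s: s.start)


def review(span, corrected_label=None):
    """Mark a draft span as reviewed, optionally replacing its label."""
    return replace(span, label=corrected_label or span.label, reviewed=True)


if __name__ == "__main__":
    drafts = preannotate("Aspirin relieved the headache.")
    for span in drafts:
        print(span)
    # An annotator accepts the first draft and corrects the second.
    final = [review(drafts[0]), review(drafts[1], "CONDITION")]
    print(final)
```

Keeping a `reviewed` flag on each span is one simple way to distinguish machine drafts from labels a human has confirmed, which matters when the resulting data is later used for training or evaluation.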
Common methods include machine-generated prelabels from classifiers, rule-based taggers, dictionaries or ontologies, and weak or distant supervision.
Benefits and risks: Preannotation can substantially reduce labeling time, improve initial consistency, and help annotators focus
Applications and considerations: Preannotation is used for creating training datasets, building annotation guidelines, and speeding curation
See also: annotation guidelines, active learning, weak supervision, data labeling, and human-in-the-loop workflows.