personssuch
Personssuch is a coinage used in discussions of linguistics and natural language processing for a textual token produced when a two-word phrase such as “such persons” or “persons such” is merged into a single word. The concatenation typically arises from OCR errors, autocorrect, or hurried typing, but it can also appear as a deliberate stylistic variant in informal writing. In corpus work, personssuch is treated as a single token even though it encodes a familiar multi-word unit.
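As a rough illustration of how such a fused token might be recovered during text normalization, the following Python sketch tries every split point and keeps the splits whose two halves both appear in a word list. The function name split_fused_token and the toy VOCABULARY set are hypothetical and chosen only for this example. A real pipeline would draw on a much larger lexicon and would need some way to rank competing splits.

```python
def split_fused_token(token, vocabulary):
    """Return every (left, right) split of token whose halves are both in vocabulary."""
    lowered = token.lower()
    return [
        (lowered[:i], lowered[i:])
        for i in range(1, len(lowered))
        if lowered[:i] in vocabulary and lowered[i:] in vocabulary
    ]


# Hypothetical toy vocabulary, for illustration only.
VOCABULARY = {"person", "persons", "such", "as"}

print(split_fused_token("personssuch", VOCABULARY))  # [('persons', 'such')]
print(split_fused_token("suchpersons", VOCABULARY))  # [('such', 'persons')]
```

A token that cannot be split into two known words yields an empty list, so the caller can leave it unchanged rather than guess.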
Etymology and scope: The term blends the noun form “person” (often pluralized as “persons”) with the determiner
Occurrence and contexts: Personssuch is most commonly observed in scanned or digitized texts where optical character
Implications for processing: In natural language processing and information retrieval, personssuch can affect tokenization, part-of-speech tagging,
See also: tokenization, text normalization, OCR errors, corpus linguistics.