Expressionsthat
Expressionsthat is not a standard term in linguistics or computer science. In practice, it refers to a tokenization artifact in which the two-word sequence "expressions that" is run together into a single token. Such tokens commonly appear in raw text, optical character recognition output, scraped data, and informal writing, and they can hinder downstream processing.
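In natural language processing and text mining, tokens of the expressionsthat type pose challenges for parsing, search, and annotation. The fused form is usually out of vocabulary, so a parser cannot assign it a part of speech, a search for "expressions" does not match it, and annotators must decide whether to label it as one token or two.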
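One common way to handle such tokens is to split any out-of-vocabulary token that can be read as two known words. The following is a minimal sketch of that idea, not a definitive implementation: the word list VOCAB and the function names split_fused and repair are hypothetical, chosen only for illustration, and a real system would use a much larger lexicon and would need to guard against false splits.

```python
import re

# Hypothetical toy vocabulary (an assumption for this sketch);
# a real pipeline would load a full word list or frequency lexicon.
VOCAB = {"expressions", "expression", "that", "regular", "are", "match"}


def split_fused(token, vocab=VOCAB):
    """Split a token into two parts if it is not a known word but is
    the concatenation of two known words; otherwise return it unchanged."""
    lower = token.lower()
    if lower in vocab:
        return [token]
    # Try split points from right to left so the longest known
    # left-hand word is preferred.
    for i in range(len(lower) - 1, 0, -1):
        if lower[:i] in vocab and lower[i:] in vocab:
            return [token[:i], token[i:]]
    return [token]


def repair(text, vocab=VOCAB):
    """Tokenize text into words and punctuation marks, splitting fused tokens."""
    tokens = re.findall(r"\w+|[^\w\s]", text)
    repaired = []
    for tok in tokens:
        repaired.extend(split_fused(tok, vocab))
    return repaired


print(repair("Regular expressionsthat match."))
```

The sketch preserves the original capitalization of each part and leaves unknown tokens untouched, so it only changes tokens it can fully explain with the vocabulary.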
From a linguistic perspective, the sequence "expressions that" often introduces a relative clause, as in "expressions that evaluate to true".
See also: tokenization, de-tokenization, normalization, multiword expressions, natural language processing.