whitetinging
whitetinging is a specialised term in digital corpus linguistics for the systematic process of identifying, extracting, and classifying lexical items that occur predominantly in white‑noise segments of spoken language recordings. The practice emerged in the early 2010s as researchers sought more precise methods for quantifying speech disfluencies and ambient noise contamination in multimodal data. By applying machine‑learning filters to raw audio streams, whitetinging isolates words and phonemes that are frequently disrupted by environmental sounds, enabling a clearer textual representation of the speech content.
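The frame-level filtering described above can be sketched roughly as follows. This is an illustrative assumption, not a published whitetinging toolchain: it uses spectral flatness (close to 1 for white noise, close to 0 for tonal speech) as the noise measure, and the function names and threshold are hypothetical.

```python
import numpy as np

def spectral_flatness(frame: np.ndarray) -> float:
    """Spectral flatness of one frame: ~1 for white noise, ~0 for tonal content."""
    power = np.abs(np.fft.rfft(frame)) ** 2 + 1e-12  # floor avoids log(0)
    return float(np.exp(np.mean(np.log(power))) / np.mean(power))

def flag_noise_frames(signal: np.ndarray, frame_len: int = 512,
                      threshold: float = 0.3) -> list[bool]:
    """Split a signal into fixed-length frames and flag noise-like ones."""
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    return [spectral_flatness(f) > threshold for f in frames]

# Illustrative inputs: a pure tone (speech-like) and white noise.
rng = np.random.default_rng(0)
tone = np.sin(2 * np.pi * 40 * np.arange(512) / 512)
noise = rng.standard_normal(512)
```

A real pipeline would of course replace the flatness heuristic with the trained filters the article mentions; the point here is only the frame-then-flag structure.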
The technique involves several stages. First, an acoustic segmentation algorithm divides the recording into frames, marking each frame as speech‑dominant, noise‑dominant, or mixed. The machine‑learning filters described above are then applied to the noise‑bearing frames, and the lexical items that overlap them are extracted and classified.
whitetinging has been applied in sociolinguistic studies of free‑conversation corpora and in noise‑robust speech recognition research.
Because whitetinging can inadvertently discard legitimate lexical items that coincidentally overlap with background noise, scholars typically pair the automated filters with manual verification of the discarded material.
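One minimal safeguard along these lines is to cross‑check flagged tokens against a reference lexicon before discarding them, routing known words to manual review. This is a sketch under stated assumptions: the lexicon check and the function name are hypothetical, not a documented whitetinging procedure.

```python
def verify_flagged_tokens(flagged: list[str],
                          lexicon: set[str]) -> tuple[list[str], list[str]]:
    """Split flagged tokens into (needs manual review, safe to discard).

    Tokens found in the reference lexicon may be legitimate words that
    merely overlapped background noise, so they are kept for review.
    """
    review = [t for t in flagged if t.lower() in lexicon]
    discard = [t for t in flagged if t.lower() not in lexicon]
    return review, discard

# Example: "hiss" is a real word, "kshhh" is noise residue.
review, discard = verify_flagged_tokens(["hiss", "kshhh"], {"hiss", "static"})
```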