Stt
STT is most commonly used as an acronym for speech-to-text, the technology that converts spoken language into written text. In computing, STT systems are a core component of automatic speech recognition (ASR), enabling real-time transcription, captions for media, and searchable transcripts across many languages.
An STT system involves three main elements: an acoustic model that maps audio features to linguistic units,
Historically, early STT relied on template matching and statistical methods. Since the 2010s, deep learning and
Applications span transcription services, live captions for broadcast, voice assistants, customer-service automation, and accessibility tools for
Evaluation typically uses word error rate (WER) as the standard metric, with character error rate (CER) used
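As an illustration of the metrics named above, WER is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the reference transcript into the system output, divided by the number of words in the reference. CER applies the same calculation to characters. The following is a minimal sketch in Python, not a reference implementation: the function names `edit_distance`, `word_error_rate`, and `character_error_rate` are hypothetical, and the sketch assumes whitespace tokenization with no text normalization (casing and punctuation are compared as-is).

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # reference token missing from hypothesis
            insertion = curr[j - 1] + 1  # extra token in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = character-level edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")
    print(f"CER: {character_error_rate(ref, hyp):.3f}")
```

In this example the hypothesis contains one substitution ("sat" to "sit") and one deletion ("the"), so the word-level distance is 2 against 6 reference words. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.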