chrF
ChrF, short for character n-gram F-score, is an automatic evaluation metric used to assess the quality of machine translation and other text generation outputs. It measures the overlap of character n-grams between a system’s hypothesis and one or more reference translations by computing precision, recall, and an F-score that combines the two. Because it operates on character sequences rather than words, ChrF needs no language-specific tokenization and is more robust than word-level metrics to morphological variation, typos, and minor spelling differences, which makes it especially useful for morphologically rich languages.
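The precision, recall, and F-score computation described above can be illustrated with a minimal sketch. It is not a reference implementation. The function names `char_ngrams` and `chrf_sketch`, the default n-gram orders, the choice of beta = 2 (recall weighted more heavily than precision), the removal of whitespace, and the averaging of precision and recall across orders before combining them are all assumptions made for illustration. Established implementations differ in these details and typically report scores on a 0 to 100 scale.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n (whitespace removed: an assumption)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_sketch(hypothesis: str, reference: str,
                min_n: int = 1, max_n: int = 6, beta: float = 2.0) -> float:
    """Illustrative character n-gram F-score in [0, 1] for a single reference."""
    precisions, recalls = [], []
    for n in range(min_n, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the texts is too short for this n-gram order
        # Clipped overlap: each n-gram counts at most as often as it occurs in both.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    # Average over the n-gram orders that could be computed (an assumed scheme).
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 gives recall more weight than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    # A one-character typo still shares most character n-grams with the reference.
    print(chrf_sketch("translaton", "translation"))
```

In this sketch a hypothesis with a single missing character keeps a partial score, whereas an exact word-match metric would count the whole word as wrong, which is the robustness property the description refers to.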
The calculation is based on character n-grams. A typical configuration uses n-grams of 1 to 6 characters, though
ChrF is widely used in machine translation evaluation, including in major conferences and shared tasks, and