IDFs
IDF, short for inverse document frequency, is a statistic used in information retrieval to measure how informative a term is across a document collection. It distinguishes terms that appear in most documents from those confined to a small subset of the corpus.
Calculation and interpretation: For a corpus with N documents, the document frequency df(t) is the number of
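As a rough illustration of the calculation, the sketch below assumes the common textbook form idf(t) = log(N / df(t)), where df(t) counts the documents that contain t. The function name, the toy corpus, and the choice to return 0.0 for a term that appears in no document are assumptions made for this example only.

```python
import math

def inverse_document_frequency(term, documents):
    """Return log(N / df(t)) for `term` over a list of token sets."""
    n_docs = len(documents)
    # df(t): number of documents that contain the term at least once
    df = sum(1 for doc in documents if term in doc)
    if df == 0:
        # Assumption: treat unseen terms as uninformative instead of dividing by zero
        return 0.0
    return math.log(n_docs / df)

# Hypothetical four-document corpus, each document reduced to its set of terms
corpus = [
    {"the", "cat", "sat"},
    {"the", "dog", "barked"},
    {"the", "cat", "purred"},
    {"a", "bird", "sang"},
]

idf_the = inverse_document_frequency("the", corpus)    # df = 3, so log(4 / 3)
idf_bird = inverse_document_frequency("bird", corpus)  # df = 1, so log(4)
```

Under this form, a term found in nearly every document scores close to zero, while a term found in a single document scores highest.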
Relation to tf-idf: IDF is a key component of the tf-idf weighting scheme, where the weight of
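A minimal sketch of how the two components combine, assuming the simplest variant: raw term count as tf, log(N / df(t)) as idf, and their product as the weight. The helper name and the sample documents are hypothetical, and other tf or idf scalings are equally common in practice.

```python
import math
from collections import Counter

def tf_idf(term, doc_tokens, corpus_tokens):
    """Weight of `term` in one document: raw count times log(N / df)."""
    tf = Counter(doc_tokens)[term]  # raw term frequency (assumed tf variant)
    df = sum(1 for tokens in corpus_tokens if term in tokens)
    if df == 0:
        return 0.0  # assumption: a term absent from the corpus gets zero weight
    return tf * math.log(len(corpus_tokens) / df)

# Hypothetical tokenized corpus
docs = [
    ["the", "cat", "chased", "the", "mouse"],
    ["the", "dog", "slept"],
    ["the", "bird", "sang"],
]

w_the = tf_idf("the", docs[0], docs)  # tf = 2, but df = 3 of 3, so idf = log(1) = 0
w_cat = tf_idf("cat", docs[0], docs)  # tf = 1, df = 1 of 3, so weight = log(3)
```

Here "the" occurs twice in the first document yet receives zero weight because it appears in every document, whereas "cat" is weighted by its rarity.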
Variants and limitations: Some approaches use adjusted formulas, such as BM25’s IDF component, e.g., log((N - df
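For comparison, the sketch below assumes the widely cited form of the BM25 IDF component, log((N - df + 0.5) / (df + 0.5)), and sets it beside the plain log(N / df). Function names and the corpus sizes are illustrative assumptions. One known property of this form is that it turns negative once a term appears in more than half of the documents, which is why some implementations clamp or shift it.

```python
import math

def bm25_idf(n_docs, df):
    """Assumed BM25 IDF form: log((N - df + 0.5) / (df + 0.5))."""
    return math.log((n_docs - df + 0.5) / (df + 0.5))

def plain_idf(n_docs, df):
    """Plain form for comparison: log(N / df)."""
    return math.log(n_docs / df)

rare = bm25_idf(1000, 10)     # term in 1% of documents: positive score
common = bm25_idf(1000, 900)  # term in 90% of documents: negative score
plain_common = plain_idf(1000, 900)  # plain form stays positive here
```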