Term extraction
Term extraction, also known as keyword extraction or keyphrase extraction, is the task of automatically identifying terms or phrases that concisely characterize the content of a document or corpus. The output typically consists of noun phrases or domain-specific expressions and is used for indexing, search, summarization, topic modeling, and knowledge base construction. Term extraction can be applied to a single document or to a large collection, and to monolingual or multilingual data.
Methods used in term extraction fall into statistical, linguistic, and hybrid categories. Statistical approaches rely on
Evaluation of term extraction typically uses gold-standard term lists and metrics such as precision, recall, and
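As an illustration of evaluation against a gold-standard term list, the following minimal sketch computes precision and recall for a set of extracted terms. The function names `normalize` and `evaluate_terms`, the exact-match criterion, and the lowercase and whitespace normalization are assumptions made for this example; real evaluations may use stemming, partial matching, or ranked cutoffs instead.

```python
def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so trivial surface variants match.

    This normalization is an assumption of the sketch, not a standard.
    """
    return " ".join(term.lower().split())


def evaluate_terms(extracted, gold):
    """Return (precision, recall) of extracted terms against a gold list.

    Matching is exact after normalization (hypothetical choice).
    """
    extracted_set = {normalize(t) for t in extracted}
    gold_set = {normalize(t) for t in gold}

    # Terms that appear in both the system output and the gold standard.
    true_positives = len(extracted_set & gold_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data invented for illustration only.
    extracted = ["neural network", "Gradient  Descent", "the model", "loss function"]
    gold = ["neural network", "gradient descent", "loss function",
            "backpropagation", "learning rate"]
    precision, recall = evaluate_terms(extracted, gold)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example three of the four extracted terms appear in the gold list, and three of the five gold terms are recovered, so precision is higher than recall. Using sets means duplicate extractions are counted once, which is a design choice that suits unranked term lists.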
Applications include document indexing and retrieval, search and recommendation, ontology population, terminology management, and assistive authoring.