qrels
Qrels, short for query relevance judgments, are a collection of human judgments that indicate the relevance of documents to a set of search queries. They serve as ground truth in the evaluation of information retrieval systems, enabling researchers to quantify how well a system's ranked results match user expectations.
Each line in a qrels file usually records a topic or query identifier, an iteration number (often 0 and generally ignored by evaluation tools), a document identifier, and a relevance judgment, with fields separated by whitespace.
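A minimal parser for this whitespace-separated layout might look like the sketch below; the sample lines are invented for illustration and do not come from a real collection:

```python
# Parse TREC-style qrels lines into {query_id: {doc_id: relevance}}.
# Format per line: query_id  iteration  doc_id  relevance
sample = """\
1 0 doc_a 1
1 0 doc_b 0
2 0 doc_a 2
"""

def parse_qrels(text):
    qrels = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        query_id, _iteration, doc_id, relevance = line.split()
        # The iteration field is read but discarded, as most tools do.
        qrels.setdefault(query_id, {})[doc_id] = int(relevance)
    return qrels

print(parse_qrels(sample))
```

The nested-dictionary shape makes later lookups (is this document judged relevant for this query?) a simple two-level index.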
Qrels are generated through human annotation, often by trained assessors or crowd workers. Because judging every document for every query is infeasible on large collections, pooling methods are often used: only the top-ranked documents from a set of participating systems are collected and judged, and documents outside the pool are typically treated as non-relevant.
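Depth-k pooling can be sketched as taking the union of each system's top-k results; the run rankings below are hypothetical:

```python
# Depth-k pooling: assessors judge only the union of each
# system's top-k ranked documents for a given query.
runs = {
    "system_a": ["d1", "d2", "d3", "d4"],
    "system_b": ["d2", "d5", "d1", "d6"],
}

def pool(runs, depth):
    pooled = set()
    for ranking in runs.values():
        pooled.update(ranking[:depth])  # top-k from each run
    return pooled

print(sorted(pool(runs, 2)))  # documents sent to assessors
```

With a depth of 2, only the documents ranked in the top two by at least one system enter the judging pool, which keeps assessment cost roughly linear in the number of systems rather than the size of the collection.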
Evaluation uses qrels to compute metrics such as precision at k, recall, average precision, mean average precision (MAP), and normalized discounted cumulative gain (nDCG).
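Two of these metrics can be computed directly from a set of judged-relevant documents; the ranking and judgments below are invented for illustration:

```python
def precision_at_k(ranking, relevant, k):
    # Fraction of the top-k retrieved documents that are relevant.
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / k

def average_precision(ranking, relevant):
    # Mean of precision values at each rank where a relevant
    # document is retrieved, divided by the total number of
    # relevant documents in the qrels.
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

relevant = {"d1", "d3"}           # judged relevant in the qrels
ranking = ["d1", "d2", "d3"]      # one system's ranked output
print(precision_at_k(ranking, relevant, 2))   # 0.5
print(average_precision(ranking, relevant))
```

Averaging the per-query average precision over all queries in the qrels yields MAP.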
Qrels are dataset-specific; examples include TREC collections, LETOR, and MSMARCO. While the general concept is common across collections, the exact file format and relevance scale (binary versus graded) vary from one dataset to another.