Ranking metrics

Ranking metrics are quantitative measures used to assess the quality of ordered lists produced by information retrieval systems, search engines, and recommender systems. They compare a system-generated ranking against ground-truth judgments of item relevance. Ranking metrics serve both as evaluation tools and as optimization objectives in learning-to-rank, a family of approaches that trains models to produce better ranked results.

Ranking metrics fall into several families. Pointwise metrics evaluate each item independently; pairwise metrics compare pairs of items; and listwise metrics assess the entire ranked list. Common examples include the following (minimal code sketches for each appear after the list):

- Precision@K and Recall@K, which measure the proportion of relevant items within the top K positions and the proportion of all relevant items recovered in the top K, respectively.

- Average Precision (AP) and Mean Average Precision (MAP), where AP is the average of the precisions at the ranks of relevant items and MAP is the mean of AP over queries.

- Discounted Cumulative Gain (DCG) and Normalized DCG (NDCG). DCG sums relevance with a logarithmic discount by position; NDCG normalizes by the ideal DCG (IDCG) to yield a score between 0 and 1.

- Reciprocal Rank (RR) and Mean Reciprocal Rank (MRR), based on the rank of the first relevant item.

- Rank correlation measures such as Spearman’s rho and Kendall’s tau, which assess agreement between predicted and true rankings.

- AUC (the area under the ROC curve), which is used with score-based rankings as a measure of how well the scores discriminate relevant from non-relevant items.

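To make the list above concrete, here is a minimal Python sketch of Precision@K and Recall@K for binary relevance. The function names and the toy data are illustrative rather than taken from any particular library.

```python
def precision_at_k(ranked_items, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    top_k = ranked_items[:k]
    return sum(1 for item in top_k if item in relevant) / k


def recall_at_k(ranked_items, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    top_k = ranked_items[:k]
    return sum(1 for item in top_k if item in relevant) / len(relevant)


# Toy example: a ranked list of document ids and the set judged relevant.
ranking = ["d3", "d1", "d7", "d5", "d2"]
relevant = {"d1", "d2", "d9"}
print(precision_at_k(ranking, relevant, 3))  # 1/3
print(recall_at_k(ranking, relevant, 3))     # 1/3
```
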
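A similar sketch for Average Precision and MAP under the same binary-relevance assumption. Dividing by the total number of relevant items, so that relevant items never retrieved count as zero, is one common convention.

```python
def average_precision(ranked_items, relevant):
    """Mean of precision taken at the rank of each relevant item."""
    if not relevant:
        return 0.0
    hits = 0
    precisions = []
    for k, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / k)
    # Divide by the number of relevant items so missed items count as 0.
    return sum(precisions) / len(relevant)


def mean_average_precision(rankings, relevants):
    """MAP: the mean of AP over a collection of queries."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps)


# Two toy queries.
rankings = [["d3", "d1", "d2"], ["d5", "d4"]]
relevants = [{"d1", "d2"}, {"d4"}]
print(mean_average_precision(rankings, relevants))  # (0.583 + 0.5) / 2, about 0.54
```
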
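For graded relevance, a sketch of DCG and NDCG using the common rel / log2(rank + 1) discount; note that other gain and discount conventions (for example a 2^rel - 1 gain) are also in use.

```python
import math


def dcg(relevances):
    """DCG: graded relevance discounted by log2 of the position."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """NDCG: DCG divided by the ideal DCG (relevances sorted descending)."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Graded judgments (e.g. 0 = irrelevant, 3 = highly relevant) in ranked order.
print(ndcg([3, 2, 0, 1]))  # ~0.985
```
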
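Reciprocal Rank and MRR reduce to finding the position of the first relevant item; a minimal sketch under the same illustrative conventions:

```python
def reciprocal_rank(ranked_items, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, relevants):
    """MRR: the mean of reciprocal rank over queries."""
    rrs = [reciprocal_rank(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(rrs) / len(rrs)


print(mean_reciprocal_rank([["d3", "d1"], ["d4", "d5"]],
                           [{"d1"}, {"d4"}]))  # (0.5 + 1.0) / 2 = 0.75
```
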
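Rank correlation and AUC are usually taken from existing libraries rather than reimplemented; the sketch below assumes SciPy and scikit-learn are installed and uses their spearmanr, kendalltau, and roc_auc_score functions.

```python
from scipy.stats import spearmanr, kendalltau
from sklearn.metrics import roc_auc_score

# Agreement between true relevance grades and predicted scores for the same items.
true_grades = [3, 2, 0, 1, 2]
pred_scores = [0.9, 0.7, 0.1, 0.3, 0.6]
rho, _ = spearmanr(true_grades, pred_scores)
tau, _ = kendalltau(true_grades, pred_scores)
print(rho, tau)

# AUC treats the scores as a binary classifier: the probability that a randomly
# chosen relevant item is scored above a randomly chosen non-relevant one.
binary_labels = [1, 1, 0, 0, 1]
print(roc_auc_score(binary_labels, pred_scores))  # 1.0: every relevant item outscores every non-relevant one
```
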
Choice of metric depends on the task, the data, and whether top-K quality or overall ranking quality is most important. Ground-truth judgments can be binary or graded, which affects metric suitability. Metrics can guide model selection and hyperparameter tuning in learning-to-rank, but they also reflect biases in the judgments and may be sensitive to ties and missing data.