topk

Topk, commonly written as top-k, refers to the operation of selecting the k elements with the highest (or sometimes the lowest) values from a collection. It is a fundamental concept in algorithms, data analysis, and machine learning, and it appears in a variety of software libraries as a primitive function that returns both the top-k elements and their positions or indices in the original data.
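
As an illustration of such a primitive, here is a minimal Python sketch. The function name `topk` and its signature are hypothetical and not taken from any particular library. It relies on the standard-library `heapq.nlargest` and resolves ties in favor of the earlier index, which is only one of several possible conventions.

```python
import heapq
from typing import List, Sequence, Tuple


def topk(values: Sequence[float], k: int) -> Tuple[List[float], List[int]]:
    """Return the k largest values in descending order and their indices."""
    if k <= 0:
        return [], []
    # Rank each index by the value it points to and keep the k best.
    # Ties keep their original order, so the earlier index comes first.
    indices = heapq.nlargest(k, range(len(values)), key=values.__getitem__)
    return [values[i] for i in indices], indices


values, indices = topk([3, 1, 4, 1, 5, 9, 2, 6], k=3)
print(values)   # [9, 6, 5]
print(indices)  # [5, 7, 4]
```

If k exceeds the length of the input, this sketch simply returns all elements in descending order.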

Definition and variants: Given a set or sequence of items with comparable keys and an integer k, the top-k elements are the k items with the largest values. The elements can be returned in descending order or in an arbitrary order, and some definitions specify how to handle ties. Variants include selecting the k largest or the k smallest items, and optionally including the associated indices or additional attributes.

Algorithms: A straightforward method is to sort the data and take the first k elements, with a time complexity of O(n log n). More efficient approaches use a min-heap (priority queue) of size k to maintain the current top-k in O(n log k) time. Quickselect-based selection finds the k-th largest element in linear average time; partitioning around that element then yields the top-k in O(n) total when the result does not need to be sorted. For streaming or memory-constrained scenarios, approximate top-k algorithms (for example Misra-Gries or Space-Saving) provide compact summaries with guaranteed error bounds.

Applications: Top-k is widely used to present results in search and information retrieval, to recommend items in recommender systems, and to identify heavy hitters in data streams. In machine learning, top-k is used as an evaluation metric for multiclass classification (top-k accuracy) and appears in neural network architectures as a form of pooling (k-max pooling) or in attention mechanisms.

See also: argmax, argmin, order statistics, selection algorithms.
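
The Misra-Gries summary mentioned under Algorithms can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `misra_gries` and the parameter `k` (which here sets the number of counters to k - 1) are assumptions made for the example. Under the usual analysis, any item occurring more than n/k times in a stream of length n is retained, and each stored count underestimates the true count by at most n/k.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Summarize a stream using at most k - 1 counters."""
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


summary = misra_gries(["a", "b", "a", "c", "a", "d", "a", "e", "a"], k=3)
print(summary)  # {'a': 3}
```

In this run the true count of "a" is 5 and the stream length is 9, so the stored count of 3 is within the n/k = 3 bound. A second pass over the data is needed if exact counts for the candidates are required.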