Raters

Raters are individuals who evaluate or judge items by assigning a score or category. They provide subjective assessments that are difficult to measure directly, such as perceived quality, usefulness, or relevance. Raters are used in research, product testing, and data labeling for machine learning systems.

Raters can be categorized by expertise and setting. Expert raters are specialists in a field, while crowd raters, or crowdworkers, are non-experts recruited via platforms to provide large-scale judgments at lower cost. Applications include consumer product testing, translation quality assessment, image and video quality evaluation, sentiment annotation, and relevance judgments in information retrieval and content moderation.

Raters receive training and calibration to improve reliability. They use structured rating scales, such as Likert scales, semantic differentials, or numeric quality scores. To assess consistency, researchers measure inter-rater reliability with statistics like Cohen's kappa for two raters, Fleiss' kappa for many raters, or Krippendorff's alpha.
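
Cohen's kappa corrects raw agreement between two raters for the agreement expected by chance. The sketch below computes it from first principles in Python; the rater labels are hypothetical, and real studies would typically use an established implementation such as scikit-learn's cohen_kappa_score.

    from collections import Counter

    def cohens_kappa(rater_a, rater_b):
        # kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
        # agreement and p_e the agreement expected by chance from each
        # rater's marginal category frequencies.
        n = len(rater_a)
        p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        freq_a, freq_b = Counter(rater_a), Counter(rater_b)
        p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
        return (p_o - p_e) / (1 - p_e)

    # Hypothetical relevance judgments from two raters on five items.
    rater_1 = ["relevant", "relevant", "irrelevant", "relevant", "irrelevant"]
    rater_2 = ["relevant", "irrelevant", "irrelevant", "relevant", "irrelevant"]
    print(round(cohens_kappa(rater_1, rater_2), 3))  # 0.615

A kappa of 1.0 indicates perfect agreement and 0 indicates chance-level agreement; interpretation thresholds vary by field.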

Quality control includes gold-standard items, qualification tests, and aggregating ratings by mean, median, or consensus. Ethical considerations include fair compensation, respect for privacy, and minimizing bias and fatigue. Clear instructions and documentation help ensure reproducibility and auditability.

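A minimal sketch of these quality-control steps, assuming hypothetical item names, a 1-to-5 numeric scale, and an illustrative qualification threshold:

    from collections import Counter
    from statistics import mean, median

    # Hypothetical ratings on a 1-5 scale: item id -> one score per rater.
    ratings = {"item_1": [4, 5, 4], "item_2": [2, 3, 3], "item_3": [5, 5, 1]}

    for item, scores in ratings.items():
        # Majority consensus; ties resolve to the first value seen.
        consensus = Counter(scores).most_common(1)[0][0]
        print(item, mean(scores), median(scores), consensus)

    # Qualification test: score a rater against gold-standard items
    # (items with known correct labels); the 0.8 cutoff is an assumption.
    gold = {"item_1": 4, "item_2": 3}
    candidate = {"item_1": 4, "item_2": 2}
    accuracy = mean(candidate[i] == gold[i] for i in gold)
    print("qualified:", accuracy >= 0.8)

In practice, the aggregation rule and qualification threshold depend on the task; medians and majority votes are more robust to a single outlier rating than means.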