TempEval

TempEval is a series of shared tasks in natural language processing designed to benchmark automatic processing of temporal information in text. Initiated by researchers in the field, TempEval aims to standardize evaluation and spur progress in extracting, normalizing, and reasoning about time in natural language. Tasks are based on the TimeML annotation framework and the TimeBank corpus to enable comparability across studies.

Core tasks typically include temporal expression recognition and normalization (TIMEX3), event detection and classification, and extraction of temporal relations between events and times (TLINKs). Some editions also evaluate narrative ordering and cross-document temporal reasoning. Systems must identify when events occur, how long they last, and how they relate temporally within a text.

Data resources mainly come from TimeBank annotated with TimeML/TIMEX3, with standard splits for training, development, and testing. Evaluation uses precision, recall, and F1 for recognition and classification tasks; TIMEX3 normalization is evaluated with value-based scoring; relation extraction is scored with relation-level accuracy or F1. The shared task promotes uniform benchmarks and reproducibility.

TempEval has influenced the NLP community by providing a common benchmark for temporal information processing, guiding algorithm development and dataset creation. Over the years, multiple iterations (often referred to as TempEval-1, -2, -3, -4) have contributed to advances in temporal expression processing, event detection, and temporal reasoning. Related topics include TimeML and TimeBank.
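
As an illustration of the evaluation metrics mentioned above, the following minimal Python sketch computes strict-match precision, recall, and F1 for temporal expression recognition, together with a simple value-based score for TIMEX3 normalization. It is not the official TempEval scorer, which may use different matching rules such as relaxed (overlapping) extents. The function names `extent_prf` and `value_accuracy`, the character-offset span representation, and the sample annotations are all assumptions made for this example.

```python
from typing import Dict, Tuple

# Hypothetical representation: (start, end) character offsets -> normalized value.
Span = Tuple[int, int]
Annotations = Dict[Span, str]


def extent_prf(gold: Annotations, pred: Annotations) -> Tuple[float, float, float]:
    """Strict-match precision, recall, and F1 over expression extents."""
    matched = set(gold) & set(pred)  # spans with identical offsets
    precision = len(matched) / len(pred) if pred else 0.0
    recall = len(matched) / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def value_accuracy(gold: Annotations, pred: Annotations) -> float:
    """Share of correctly located expressions whose normalized value also matches."""
    matched = set(gold) & set(pred)
    if not matched:
        return 0.0
    correct = sum(1 for span in matched if gold[span] == pred[span])
    return correct / len(matched)


# Made-up sample data: three gold expressions, three system predictions.
gold = {(0, 9): "2013-03-01", (20, 29): "P2D", (40, 48): "2013-03"}
pred = {(0, 9): "2013-03-01", (20, 29): "P3D", (60, 65): "2013"}

p, r, f1 = extent_prf(gold, pred)
acc = value_accuracy(gold, pred)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f} value accuracy={acc:.3f}")
```

In this sample, two of the three predicted extents match a gold extent, and one of those two matched expressions also carries the correct normalized value. Scoring normalization only over matched extents is one possible design choice; it separates recognition errors from normalization errors.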