CoNLL-2012

CoNLL-2012, the English coreference resolution track of the Conference on Computational Natural Language Learning (CoNLL), was the 2012 edition of the CoNLL shared tasks. It focused on developing systems that identify mentions in text and determine which of them refer to the same real-world entities.

Data for the task was drawn from OntoNotes 5.0, a multilingual annotated corpus. The English portion includes texts from multiple genres, such as newswire, broadcast news, telephone conversations, and web data. Each document contains mentions annotated with their referents and organized into coreference chains. The dataset was divided into training, development, and test sets, with gold annotations provided for training and development but withheld for the test set to enable evaluation of submitted systems.

Task: Participants built systems that output coreference clusters for all mentions in the test documents. Submissions were evaluated using established coreference metrics, notably MUC, B³, and CEAF. The official CoNLL score is typically reported as the average of these three metrics, providing a single measure to compare approaches.

Impact: CoNLL-2012 established a large, multi-genre benchmark for English coreference resolution and facilitated rigorous, cross-system evaluation. It contributed to methodological advances in the field and remains a widely cited reference for research in coreference and related discourse understanding tasks.
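The official score described above is an unweighted average of the three metric F1 values. As a minimal illustration, the helper below computes it; the function name and the per-metric F1 values in the example are hypothetical, not taken from any actual CoNLL-2012 submission.

```python
def conll_score(muc_f1: float, b3_f1: float, ceaf_f1: float) -> float:
    """Unweighted mean of the MUC, B³, and CEAF F1 scores,
    the single-number summary used to rank CoNLL-2012 systems."""
    return (muc_f1 + b3_f1 + ceaf_f1) / 3.0

# Hypothetical per-metric F1 values for one system:
print(round(conll_score(70.0, 58.0, 54.0), 2))  # prints 60.67
```

Because the average weights all three metrics equally, a system cannot rank highly by optimizing for only one of them.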