Coreference

Coreference is the linguistic relation in which multiple expressions in a discourse refer to the same real-world entity. Expressions can be proper names, common nouns, or pronouns, and a coreference chain groups all mentions that refer to the same entity within a text. Coreference is essential for coherent interpretation, discourse structure, and information integration.
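
A coreference chain can be pictured as a simple data structure: each mention is a token span, and a chain is the set of spans that refer to one entity. The following is a minimal sketch, not a standard representation. The `Mention` class, the `build_chains` function, and the hand-written links are all hypothetical names and data chosen for illustration; the links are assumed to come from some upstream decision about which mentions corefer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    start: int  # index of the first token (inclusive)
    end: int    # index of the last token (inclusive)
    text: str   # surface form, kept for readability


def build_chains(mentions, links):
    """Group mentions into chains from pairwise coreference links (union-find)."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the representative, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)
    # Order the mentions of each chain by their position in the text.
    return [sorted(chain, key=lambda m: m.start) for chain in chains.values()]


tokens = "Alice arrived late because she was stuck in traffic".split()
alice = Mention(0, 0, tokens[0])
she = Mention(4, 4, tokens[4])
traffic = Mention(8, 8, tokens[8])

# Assumed input: a single link saying that "she" corefers with "Alice".
chains = build_chains([alice, she, traffic], [(alice, she)])
chain_texts = [[m.text for m in chain] for chain in chains]
print(chain_texts)
```

Mentions that are never linked to anything, such as "traffic" here, end up in single-element chains (often called singletons).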

Anaphora is a common type of coreference where a later expression refers back to an earlier antecedent, as in "Alice arrived late because she was stuck in traffic," where "she" refers to "Alice." Cataphora is the opposite, where an expression refers forward to a later antecedent, such as "When he spoke, John smiled," where "he" points to "John."

Coreference resolution (also called anaphora resolution) is the task of identifying antecedents for all mentions and forming coherent clusters of co-referring expressions. Approaches range from rule-based and statistical methods to modern neural models that learn to form clusters of spans without explicit mention-pair rules. End-to-end neural models compute representations for candidate mentions and decide which mentions belong together in the same entity.

In practice, coreference resolution supports many natural language processing applications, including machine translation, document summarization, question answering, and information extraction. Evaluation typically uses benchmark datasets and metrics such as MUC, B3, CEAF, and related CoNLL shared-task scores. Key challenges include resolving long-distance dependencies, pronoun ambiguity, zero pronouns in pro-drop languages, and the need for world knowledge and discourse context to disambiguate references.
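
To make one of these metrics concrete, the sketch below computes B3 (B-cubed) precision, recall, and F1 for a toy example. It assumes the simplest setting, in which the gold ("key") and predicted ("response") clusterings cover exactly the same set of mentions; official scorers also handle mismatched mention sets, which this sketch does not. The function name `b_cubed` and the mention identifiers are hypothetical.

```python
def b_cubed(key, response):
    """B3 scores for two clusterings over the same mentions.

    key, response: lists of sets of mention identifiers.
    Assumes every mention appears in exactly one cluster on each side.
    """
    key_of = {m: cluster for cluster in key for m in cluster}
    resp_of = {m: cluster for cluster in response for m in cluster}
    mentions = list(key_of)

    # Per-mention precision: share of the predicted cluster that is correct.
    precision = sum(
        len(key_of[m] & resp_of[m]) / len(resp_of[m]) for m in mentions
    ) / len(mentions)
    # Per-mention recall: share of the gold cluster that was recovered.
    recall = sum(
        len(key_of[m] & resp_of[m]) / len(key_of[m]) for m in mentions
    ) / len(mentions)

    f1 = 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: the system wrongly moves mention "c" into the second entity.
gold = [{"a", "b", "c"}, {"d", "e"}]
predicted = [{"a", "b"}, {"c", "d", "e"}]

p, r, f = b_cubed(gold, predicted)
print(f"P={p:.3f} R={r:.3f} F1={f:.3f}")
```

Because B3 averages over mentions rather than links, a single misplaced mention lowers the score of every mention in both clusters it touches.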