Realwords

Realwords is a term used in linguistics and language technology to denote strings that have established lexical status in a language’s standard resources, such as dictionaries and large corpora. A Realword is typically a word with a defined meaning, conventional pronunciation, and a recognized part of speech within the language community. The concept covers base forms as well as inflected or derived forms that are attested in usage, such as run, running, and runner.
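As a rough illustration of the definition above, lexical status can be approximated as a membership test against a word list. The sketch below is a minimal example under stated assumptions: `LEXICON`, `is_realword`, and `split_by_status` are hypothetical names introduced here, the tiny word list stands in for a real dictionary or corpus-derived vocabulary, and the ASCII-only tokenizer is a simplification rather than a recommended practice.

```python
import re

# Hypothetical mini-lexicon; a real system would load a dictionary
# or a corpus-derived word list instead (assumption for illustration).
LEXICON = frozenset({"run", "runs", "ran", "running", "runner"})


def is_realword(token, lexicon=LEXICON):
    """Return True if the case-folded token is attested in the lexicon."""
    return token.casefold() in lexicon


def split_by_status(text, lexicon=LEXICON):
    """Split the alphabetic tokens of `text` into (attested, unattested) lists.

    The tokenizer keeps ASCII letter runs only, which is a deliberate
    simplification for this sketch.
    """
    attested, unattested = [], []
    for token in re.findall(r"[A-Za-z]+", text):
        if is_realword(token, lexicon):
            attested.append(token)
        else:
            unattested.append(token)
    return attested, unattested
```

Under these assumptions, `is_realword("Running")` is true because the inflected form is listed, while a string such as `"blorf"` is reported as unattested. Listing inflected and derived forms explicitly mirrors the definition; a fuller treatment might instead combine a base-form lexicon with morphological analysis.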

Realwords are contrasted with non-words or pseudo-words, which are pronounceable strings lacking recognized meaning or lexical status. In research and applications, Realwords are often distinguished from nonce words or invented terms used in experiments or training data. In natural language processing, identifying Realwords aids tasks such as spell checking, lexical disambiguation, and the evaluation of language models, where authentic vocabulary is expected to appear. The status of a Realword can influence how a system handles morphology, syntax, and repair in text generation or analysis.

Definitions of what counts as a Realword can vary by language, region, and dictionary edition. Some terms may be dialectal, archaic, or technical, and their status can change over time as language evolves. Brand names and certain proper nouns may receive lexical status in dictionaries but are treated differently in some computational tasks, depending on context and application.

See also:

- Lexicography

- Lexicon

- Pseudo-word

- Neologism

- Language model

- Corpus linguistics