Vocabularies

Vocabularies are the inventories of words that make up the lexicon of a language or of an individual speaker. In linguistic usage, the term lexicon is often used to denote the full set of lexical units—words and fixed expressions—that a person recognizes and can use. A distinction is commonly made between active (productive) vocabulary, consisting of words a person readily uses in speech and writing, and receptive (passive) vocabulary, consisting of words recognized and understood in listening or reading but not necessarily produced.

For a language community, vocabulary size and structure reflect its history, domain specialization, and ongoing change. Individual vocabularies vary widely by age, education, occupation, and exposure. Estimates for adult native speakers place active vocabularies roughly in the range of a few tens of thousands of words, with passive vocabularies larger still; counts are highly language-specific and method-dependent. Vocabulary development occurs throughout life, driven by reading, listening, schooling, and deliberate study, and is shaped by semantic relationships, word formation, and frequency of use. Multilingual speakers maintain separate vocabularies for each language and may share or transfer knowledge between them.

Vocabularies can be organized in different ways. Alphabetical lists are common in dictionaries, while semantic-field groupings or word-family relations (derivations, cognates) aid learning and analysis. In lexicography, entries are defined by lemmas and expanded with inflected forms and multiword expressions.

In computational linguistics and natural language processing, a model’s vocabulary is the set of tokens it can recognize or generate. Vocabulary size affects coverage and handling of unseen words, leading to techniques such as subword modeling (byte-pair encoding, WordPiece, SentencePiece) to manage rare or agglutinated forms.
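
As an illustration of subword modeling, the following is a minimal sketch of byte-pair encoding in Python. It assumes a toy word-frequency table and is not taken from any particular library: the function names `learn_bpe`, `merge_pair`, and `segment`, the end-of-word marker `</w>`, and the tie-breaking rule are all illustrative choices. Production tokenizers differ in many details.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_counts, num_merges):
    """Learn up to `num_merges` merge rules from a {word: frequency} table."""
    # Start from single characters, plus a hypothetical end-of-word marker.
    vocab = {tuple(word) + ("</w>",): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself
        # (an arbitrary but deterministic choice made for this sketch).
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        vocab = {merge_pair(symbols, best): count for symbols, count in vocab.items()}
    return merges


def segment(word, merges):
    """Split a word, seen or unseen, into subword units by replaying the merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy corpus statistics (made-up numbers, for illustration only).
counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_bpe(counts, num_merges=10)

# "lowest" never appears in the table, yet it can still be represented
# as a sequence of learned subword units instead of an unknown token.
print(segment("lowest", merges))
```

Because segmentation only ever merges adjacent symbols, the pieces of any word concatenate back to the original string plus the marker, so an unseen or highly inflected form falls back to smaller units rather than dropping out of the vocabulary.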