baseform

The base form, often called the lemma or dictionary form, is the canonical form of a word from which its inflected forms are derived. It serves as the representative form of a lexeme in linguistic analysis and dictionary entries. Which surface form is chosen as the base form varies across languages, but in each case it is the reference point for describing inflection, syntax, and meaning.

In English, the base form of a verb is the form used to express the verb without tense, aspect, or agreement inflection, and it is typically the form found in dictionaries. Examples include go, speak, and eat. For nouns, the base form is usually the singular form that appears in dictionaries, such as cat rather than cats. In languages with rich morphology, the base form acts as the starting point for generating all inflected variants.

In computational linguistics, the base form is central to lemmatization, the process of converting a word form to its base form or lemma. This contrasts with stemming, which trims forms without guaranteeing a valid word. Lemmatization supports tasks like information retrieval, search indexing, and text analysis by normalizing different inflections to a single representative form. For example, went becomes go, and cats becomes cat.

Limitations arise from irregular forms and polysemy. Some languages have multiple valid lemmas for a single surface form, and context may be required to select the appropriate base form. Nevertheless, the concept of a base form is a foundational tool in linguistics, lexicography, and natural language processing for organizing and analyzing word forms.
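
The contrast between lemmatization and stemming, and the role of context in choosing a base form, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `IRREGULAR` table, the `LEMMAS` vocabulary, the suffix rules, and the part-of-speech labels are hypothetical toy data, not the behavior of any real lemmatizer or stemmer.

```python
# Toy illustration only: the tables and suffix rules below are hypothetical
# and far too small for real use.

# Irregular forms cannot be derived by rule, so they are looked up.
# Keys are (surface form, part of speech); the part of speech stands in
# for the context needed to pick between competing base forms.
IRREGULAR = {
    ("went", "VERB"): "go",
    ("ate", "VERB"): "eat",
    ("spoke", "VERB"): "speak",
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
}

# Small vocabulary of known base forms, used to validate rule output.
LEMMAS = {"go", "speak", "eat", "cat", "see", "saw", "study"}

# (suffix to strip, replacement) pairs, tried in order.
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]


def lemmatize(word, pos):
    """Return a base form that is a known dictionary entry, if one is found."""
    word = word.lower()
    if (word, pos) in IRREGULAR:
        return IRREGULAR[(word, pos)]
    if word in LEMMAS:
        return word
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if candidate in LEMMAS:
                return candidate
    return word  # unknown word: leave it unchanged


def stem(word):
    """Strip a suffix with no dictionary check; the result may not be a word."""
    word = word.lower()
    for suffix, _ in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: len(word) - len(suffix)]
    return word


print(lemmatize("went", "VERB"), stem("went"))             # go went
print(lemmatize("studies", "NOUN"), stem("studies"))       # study stud
print(lemmatize("saw", "VERB"), lemmatize("saw", "NOUN"))  # see saw
```

In this sketch the stemmer leaves went untouched and reduces studies to stud, which is not a word, while the lemmatizer returns go and study. The two results for saw show why a surface form alone may not determine the base form.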