FULLTEXT

Full text, in information retrieval, refers to the complete textual content of a document. It contrasts with metadata, abstracts, titles, or structured fields. A full-text search system builds an index over the content to support queries that search the words contained in the documents themselves.

In typical implementations, the text is tokenized, lowercased, and normalized; common words (stop words) may be removed; terms may be stemmed or lemmatized. The core data structure is an inverted index that maps terms to the documents containing them. When a user submits a query, the system retrieves candidate documents and ranks them using a scoring function such as TF-IDF or BM25, possibly incorporating term frequency, document frequency, and field weights.

Full-text search is provided by search engines, document management systems, and database systems with native full-text indexing. Examples include search features in web search engines, PostgreSQL's tsquery/tsvector, MySQL/MariaDB FULLTEXT indexes, and SQLite FTS modules. Some systems support phrase queries, proximity searches, boolean operators, and ranking controls, as well as support for multiple languages and stemming.

Benefits include improved recall for keyword-based queries and the ability to search entire documents. Limitations include potential noise from common terms, the need for linguistic processing, update and storage costs, and imperfections in ranking. Quality depends on indexing choices, language support, and the handling of synonyms, stop words, and document length.
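
The indexing and ranking steps described above (tokenizing and lowercasing, removing stop words, building an inverted index, scoring with BM25) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the tiny stop-word list, the sample documents, and the parameter values k1 = 1.2 and b = 0.75 are not taken from any particular system. It uses one common variant of the BM25 formula and omits stemming, lemmatization, and field weights.

```python
import math
import re
from collections import Counter, defaultdict

# Illustrative stop-word list (assumption); real systems use larger,
# language-specific lists.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def tokenize(text):
    """Lowercase, split on non-alphanumeric characters, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(docs):
    """Build an inverted index and document lengths from {doc_id: text}."""
    index = defaultdict(dict)  # term -> {doc_id: term frequency}
    lengths = {}               # doc_id -> number of indexed tokens
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths


def bm25_search(query, index, lengths, k1=1.2, b=0.75):
    """Score documents containing any query term with BM25, best first."""
    n_docs = len(lengths)
    if n_docs == 0:
        return []
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        df = len(postings)  # document frequency of the term
        if df == 0:
            continue
        # Rare terms get a higher weight than common ones.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            # Length normalization: long documents are penalized slightly.
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample documents.
docs = {
    1: "The inverted index maps terms to documents.",
    2: "BM25 is a ranking function used by search engines.",
    3: "Search engines build an inverted index for ranking documents.",
}
index, lengths = build_index(docs)
results = bm25_search("inverted index ranking", index, lengths)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

In this sample, document 3 ranks first because it contains all three query terms, while documents 1 and 2 each match only some of them. The same query text is passed through the same tokenizer as the documents, which is what allows a query term to match the indexed terms at all.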