pg_trgm

pg_trgm is a PostgreSQL extension that provides trigram-based text similarity functions, operators, and index support. It extracts trigrams (groups of three consecutive characters) from text and measures how similar two strings are by how many trigrams they share. This enables efficient fuzzy matching, approximate string comparison, and near-duplicate detection, and it can speed up pattern-based queries such as LIKE, ILIKE, and SIMILAR TO.
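The trigram comparison described above can be sketched in plain Python. This is an illustrative approximation, not the extension's actual code: it assumes ASCII input, lowercases the text, treats each run of letters and digits as a word, pads every word with two leading spaces and one trailing space, and scores two strings as shared trigrams divided by total distinct trigrams. The function names trigrams and similarity are chosen here for illustration.

```python
import re

def trigrams(text: str) -> set:
    """Approximate pg_trgm-style trigram extraction for ASCII text."""
    grams = set()
    # Lowercase, then treat each run of letters/digits as one word.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

print(sorted(trigrams("cat")))                    # ['  c', ' ca', 'at ', 'cat']
print(round(similarity("word", "two words"), 4))  # 0.3636
```

In the second call, "word" yields 5 trigrams and "two words" yields 10, of which 4 are shared, so the score is 4 / 11.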

The extension supports index-based search with both GiST and GIN indexes. The trigram operators and operator classes (gin_trgm_ops and gist_trgm_ops) accelerate queries that involve substring matching and similarity checks. In addition to indexing, pg_trgm includes functions such as similarity(text, text) and word_similarity(text, text), which compute similarity at the string and word levels, respectively. It also provides distance operators (for example, <->) to order results by how close strings are. For inspection and debugging, show_trgm(text) displays the set of trigrams for a given string, and set_limit(real) sets the session's similarity threshold used by the % operator; set_limit is deprecated in favor of the pg_trgm.similarity_threshold setting.

Usage typically begins with enabling the extension and creating a trigram index. Example: CREATE EXTENSION pg_trgm; CREATE INDEX ON my_table USING gin (my_text_column gin_trgm_ops); Such an index speeds up queries like SELECT * FROM my_table WHERE my_text_column LIKE '%foo%'; or WHERE my_text_column ILIKE '%bar%'. You can also compare similarity directly, e.g., SELECT * FROM my_table WHERE similarity(my_text_column, 'pattern') > 0.4, or use ORDER BY my_text_column <-> 'pattern' ASC to sort by closeness. Note that a filter written as a similarity() function call is not index-assisted; the index-supported form is the % operator, as in WHERE my_text_column % 'pattern'.

Considerations include increased storage and maintenance costs for trigram indexes and the need to tune the similarity threshold to balance precision and recall. pg_trgm is widely used for flexible text matching and disambiguation tasks within PostgreSQL.
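The effect of the similarity threshold on precision and recall can be illustrated with a small Python sketch. It reuses a simplified, ASCII-only approximation of trigram similarity (an assumption, not the extension's code), and the helper name matches and the sample strings are hypothetical. The value 0.3 is the default of pg_trgm.similarity_threshold; 0.15 is an arbitrary looser value chosen for contrast.

```python
import re

def trigrams(text):
    # Simplified: lowercase, split on non-alphanumerics, pad each word.
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

def similarity(a, b):
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb) if ta and tb else 0.0

def matches(query, candidates, threshold):
    """Return candidates scoring at least `threshold`, best match first."""
    scored = [(similarity(query, c), c) for c in candidates]
    return [c for s, c in sorted(scored, reverse=True) if s >= threshold]

candidates = ["postgresql", "postgre", "progress", "mysql"]
print(matches("postgres", candidates, 0.3))   # ['postgre', 'postgresql']
print(matches("postgres", candidates, 0.15))  # ['postgre', 'postgresql', 'progress']
```

Lowering the threshold admits "progress", which shares only 3 of 15 distinct trigrams with the query: more results are returned (higher recall), but more of them are loose matches (lower precision).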