regex

A regular expression, or regex, is a compact pattern used to search, match, and manipulate strings. In theory, regexes describe regular languages and relate to finite automata; in practice, software engines interpret patterns to scan text.

Origin: The concept derives from formal language theory (Kleene, 1950s). Regex-like syntax appeared in early text tools, and modern languages and utilities extend it with varying features and dialects.

Syntax overview: Literals match exact characters; metacharacters define patterns. Common features include character classes [abc], quantifiers *, +, ?, and {m,n}, anchors ^ and $, grouping (), and alternation |. Escaping a metacharacter with a backslash makes it match literally (for example, \. matches a period). Lookahead and backreferences exist in many engines but are not universal.

Usage and limitations: Regexes are used for validation, extraction, substitution, and splitting. They allow concise text processing but can be slow if poorly designed; in backtracking engines, some patterns (for example, nested quantifiers such as (a+)+) cause catastrophic backtracking, where matching time grows exponentially with input length. In theory, regexes describe regular languages, but many engines add extensions that go beyond regular languages and are potentially more powerful.

Examples: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} matches a simple email-like address. \b\d{2}/\d{2}/\d{4}\b matches a date written as two digits, two digits, and four digits separated by slashes (for example, 12/31/2024). Exact syntax varies between tools such as PCRE, Python, JavaScript, Java, and .NET.
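
The email-like and date patterns quoted in the examples can be tried out in Python's standard re module. The following is a minimal sketch rather than a reference implementation: the function names (find_emails, find_dates, redact_dates, split_fields) and the sample text are illustrative assumptions, and other dialects may need small syntax changes.

```python
import re

# Patterns quoted from the examples; raw strings keep the backslashes intact.
EMAIL_LIKE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
DATE_LIKE = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def find_emails(text):
    """Extraction: return every email-like substring in text (hypothetical helper)."""
    return EMAIL_LIKE.findall(text)


def find_dates(text):
    """Extraction: return every dd/dd/dddd substring in text (hypothetical helper)."""
    return DATE_LIKE.findall(text)


def redact_dates(text):
    """Substitution: replace each date-shaped substring with a placeholder."""
    return DATE_LIKE.sub("[date]", text)


def split_fields(line):
    """Splitting: break a line on commas, ignoring whitespace around them."""
    return re.split(r"\s*,\s*", line)


# Sample text is made up for illustration.
sample = "Contact alice@example.com or bob.smith@mail.example.org before 03/15/2024."

print(find_emails(sample))       # ['alice@example.com', 'bob.smith@mail.example.org']
print(find_dates(sample))        # ['03/15/2024']
print(redact_dates(sample))
print(split_fields("a, b ,c"))   # ['a', 'b', 'c']
```

Note that the date pattern only checks the shape of the text, not its validity: a string such as 99/99/9999 also matches, so real validation would need an additional check beyond the regex.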