gemm

GEMM, short for General Matrix Multiply, is a fundamental routine in linear algebra and a core Level 3 operation of the BLAS (Basic Linear Algebra Subprograms) interface. It computes a matrix-matrix product with scaling and accumulation, according to the formula C = alpha * op(A) * op(B) + beta * C, where A, B, and C are matrices, alpha and beta are scalars, and op(X) denotes X, its transpose, or (for complex matrices) its conjugate transpose.
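The formula can be made concrete with a minimal pure-Python sketch. The names `apply_op` and `gemm`, the `'N'`/`'T'`/`'C'` flag strings, and the list-of-rows storage are assumptions chosen for illustration; production BLAS routines typically overwrite C in place and work on contiguous arrays with explicit leading dimensions.

```python
def apply_op(X, trans):
    """Return op(X): X for 'N', its transpose for 'T', its conjugate transpose for 'C'."""
    if trans == "N":
        return [list(row) for row in X]
    if trans == "T":
        return [list(col) for col in zip(*X)]
    if trans == "C":
        return [[v.conjugate() for v in col] for col in zip(*X)]
    raise ValueError("trans must be 'N', 'T', or 'C'")


def gemm(alpha, A, B, beta, C, transa="N", transb="N"):
    """Return alpha * op(A) * op(B) + beta * C as a new list of rows."""
    opA = apply_op(A, transa)
    opB = apply_op(B, transb)
    m, k = len(opA), len(opA[0])
    n = len(opB[0])
    # op(A) is m-by-k, so op(B) must have k rows and C must be m-by-n.
    if len(opB) != k:
        raise ValueError("inner dimensions of op(A) and op(B) differ")
    if len(C) != m or any(len(row) != n for row in C):
        raise ValueError("C must be m-by-n")
    result = []
    for i in range(m):
        row = []
        for j in range(n):
            # Dot product of row i of op(A) with column j of op(B).
            acc = sum(opA[i][p] * opB[p][j] for p in range(k))
            row.append(alpha * acc + beta * C[i][j])
        result.append(row)
    return result


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 1], [1, 1]]
print(gemm(2, A, B, 3, C))              # [[41, 47], [89, 103]]
print(gemm(1, A, B, 0, C, transa="T"))  # [[26, 30], [38, 44]]
```

Setting beta to 0 reduces the operation to a scaled product, and setting alpha to 1 and beta to 1 accumulates the product into C. This is why a single routine can serve many callers.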

Dimensions and variants: If op(A) is m-by-k and op(B) is k-by-n, then C must be m-by-n. For real matrices, op(A) is typically A or A^T and op(B) is B or B^T, while for complex matrices they may also be conjugate-transposed (A^H and B^H). The operation thus covers multiple variants depending on transposition flags and data types.

Performance characteristics: GEMM performs roughly 2*m*n*k floating-point operations on only m*k + k*n + m*n matrix elements, so for large matrices it is compute-bound rather than memory-bound on modern hardware. Efficient implementations emphasize cache-friendly blocking (tiling), SIMD vectorization, and parallel execution. Many libraries ship architecture-specific micro-kernels tuned for peak performance.

Implementations and usage: GEMM is provided by BLAS-compatible libraries such as OpenBLAS, ATLAS, Intel MKL, cuBLAS, and Apple Accelerate. It serves as a building block for a wide range of linear algebra routines and is central to algorithms in scientific computing, graphics, and machine learning. Different data types have specialized routines, such as sgemm/dgemm for real single/double precision and cgemm/zgemm for single/double-precision complex data.

Summary: As a standard, high-performance primitive, GEMM underpins much of numerical computation, enabling efficient large-scale matrix multiplications across diverse applications and hardware platforms.
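The operation count and the blocking (tiling) idea mentioned under performance characteristics can be illustrated with a short sketch. The function names `gemm_flops` and `blocked_matmul` and the `block` parameter are hypothetical. The code shows only the loop structure of tiling for a plain product A * B; it will not run faster in pure Python, and real libraries combine tiling with packed buffers, SIMD micro-kernels, and threading.

```python
def gemm_flops(m, n, k):
    """Approximate flop count: one multiply and one add per (i, j, p) triple."""
    return 2 * m * n * k


def blocked_matmul(A, B, block=2):
    """Compute A * B by iterating over block-by-block tiles of the operands."""
    m, k = len(A), len(A[0])
    if len(B) != k:
        raise ValueError("inner dimensions of A and B differ")
    n = len(B[0])
    C = [[0] * n for _ in range(m)]
    for i0 in range(0, m, block):
        for p0 in range(0, k, block):
            for j0 in range(0, n, block):
                # Multiply one tile of A by one tile of B and accumulate
                # into the matching tile of C; min() handles ragged edges.
                for i in range(i0, min(i0 + block, m)):
                    for p in range(p0, min(p0 + block, k)):
                        a = A[i][p]
                        for j in range(j0, min(j0 + block, n)):
                            C[i][j] += a * B[p][j]
    return C


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 0], [0, 1], [2, 3]]
print(blocked_matmul(A, B))  # [[7, 11], [16, 23], [25, 35]]
print(gemm_flops(3, 2, 3))   # 36
```

In a compiled implementation the block size would be chosen so that the working tiles fit in a given cache level. Here it is an arbitrary small value so that the edge-handling logic is exercised.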