SIMT

SIMT stands for Single Instruction, Multiple Threads. It is a parallel execution model used by many modern graphics processing units, most notably NVIDIA’s CUDA-enabled devices. Under SIMT, groups of threads—commonly called warps—execute the same instruction across multiple data elements. The hardware schedules warps on a set of execution units, enabling thousands of threads to participate in data-parallel computations.

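As a minimal sketch of the model (the kernel name vecAdd and the sizes here are illustrative, not taken from any particular codebase), the CUDA program below launches one thread per array element. The hardware groups consecutive threads into warps, and every thread in a warp executes the same add instruction on its own element.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread computes one output element. Within a warp, all threads
// execute this same instruction stream on different data elements.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // 256-thread blocks; on hardware with 32-thread warps, each block
    // is executed as eight warps.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```
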
Unlike traditional SIMD, SIMT allows threads within a warp to follow different control paths. When a branch occurs, some threads may take the true path while others take the false path. The hardware uses masking to disable lanes not on a given path and re-enables them when threads reconverge. This divergence can reduce efficiency, because differing paths cannot be executed identically in parallel and may require serialized handling within a warp.

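To make the masking behavior concrete, here is a sketch with two hypothetical kernels (neither comes from the source): divergent branches on the parity of threadIdx.x, so every warp is split across both paths, while uniform branches at warp granularity, so each warp stays on a single path.

```cuda
#include <cuda_runtime.h>

// Divergent: even and odd lanes of the same warp take different paths.
// The hardware runs the two paths one after the other, masking off the
// lanes that are not on the path currently being executed.
__global__ void divergent(float* out, const float* in, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (threadIdx.x % 2 == 0) {
        out[i] = in[i] * 2.0f;
    } else {
        out[i] = in[i] + 1.0f;
    }
}

// Uniform: the branch condition is constant across each warp, so no
// lane masking is required within a warp; different warps simply take
// different paths independently.
__global__ void uniform(float* out, const float* in, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if ((threadIdx.x / warpSize) % 2 == 0) {
        out[i] = in[i] * 2.0f;
    } else {
        out[i] = in[i] + 1.0f;
    }
}
```

The two kernels differ only in how the branch condition lines up with warp boundaries, which is what determines whether masking and serialization occur.
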
In practice, a warp often consists of a fixed number of threads (for example, 32 in many NVIDIA GPUs), while other architectures use different sizes, such as 64-thread wavefronts. The scheduler interleaves many warps to hide memory latency and keep execution units busy. Performance is strongly influenced by memory access patterns: coalesced global memory accesses, effective use of shared memory, and caching are important for throughput and latency hiding. Occupancy, the number of active warps per streaming multiprocessor, also affects performance, but it is balanced against resource limits per multiprocessor.

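The following sketch (kernel names hypothetical) contrasts a coalesced access pattern with a strided one, and queries occupancy with the CUDA runtime's cudaOccupancyMaxActiveBlocksPerMultiprocessor, which reports how many blocks of a given kernel can be resident on one streaming multiprocessor.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Coalesced: consecutive threads in a warp touch consecutive addresses,
// so the warp's accesses combine into a few wide memory transactions.
__global__ void copyCoalesced(float* dst, const float* src, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) dst[i] = src[i];
}

// Strided: adjacent threads touch addresses `stride` elements apart,
// scattering each warp's accesses across many transactions.
__global__ void copyStrided(float* dst, const float* src, int n, int stride) {
    int i = (blockIdx.x * blockDim.x + threadIdx.x) * stride;
    if (i < n) dst[i] = src[i];
}

int main() {
    // Ask the runtime how many 256-thread blocks of copyCoalesced can
    // be resident on one streaming multiprocessor, given the kernel's
    // register and shared-memory usage (0 bytes of dynamic shared memory).
    int blocksPerSM = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &blocksPerSM, copyCoalesced, 256, 0);
    printf("Resident 256-thread blocks per SM: %d\n", blocksPerSM);
    return 0;
}
```

On most NVIDIA GPUs the strided copy runs markedly slower than the coalesced one at larger strides, since each warp's loads and stores spread across many memory transactions.
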
SIMT is a core concept in GPU programming models like CUDA and underpins scalable data-parallel computation. It is well suited to workloads such as linear algebra, image processing, and simulations, where many threads operate on large data sets concurrently.
