STONITH

STONITH ("Shoot The Other Node In The Head") is the fencing mechanism used in high-availability clusters to forcibly isolate a malfunctioning or unreachable node and so protect data integrity. The expansion is tongue-in-cheek, but the practical purpose is serious: to prevent a split-brain scenario, in which two or more nodes each believe they are the active primary and access the same shared resources at once.

The primary goal of STONITH is to ensure that a failed or partitioned node cannot continue accessing any shared storage or resources. When a cluster detects a node is unresponsive or misbehaving, it uses a fence device or fence agent to "shoot" that node from the cluster, typically by power-cycling or severing network access. A successful fence action should decisively render the node harmless to the shared resources. Modern cluster stacks, such as Pacemaker with Corosync, manage STONITH devices via fence agents and support a range of hardware and software fencing methods.

Common fencing methods include hardware-based power fencing (IPMI, iDRAC, iLO, and smart PDUs) and software or network-based fencing techniques. Fence agents, such as fence_ipmilan, fence_ipmi, or vendor-specific controllers, implement the communication with the fence device. STONITH configurations should be highly reliable and auditable, with authentication and logging, and are considered essential in many clusters to guarantee data integrity.

In some environments, STONITH can be disabled only under tightly controlled conditions, but doing so increases the risk of data corruption in case of failover. Proper fencing is a foundational component of robust high-availability architectures.
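
The fence-before-recovery ordering described in this article can be illustrated with a small simulation. The sketch below is not Pacemaker code and does not talk to any real fence device: the class names SimulatedFenceDevice and Cluster, the power_off method, and the node names are all hypothetical, chosen only to show that a resource should move to a surviving node after the failed node has been confirmed fenced, and not before.

```python
from dataclasses import dataclass, field


@dataclass
class SimulatedFenceDevice:
    """Hypothetical stand-in for a real fence device such as a BMC or smart PDU."""
    reachable: bool = True
    powered_off: set = field(default_factory=set)

    def power_off(self, node: str) -> bool:
        # Report success only if the device itself could be reached.
        if not self.reachable:
            return False
        self.powered_off.add(node)
        return True


@dataclass
class Cluster:
    """Hypothetical cluster that tracks one shared resource and its owner."""
    fence_device: SimulatedFenceDevice
    resource_owner: str
    log: list = field(default_factory=list)

    def handle_unresponsive(self, failed: str, survivor: str) -> bool:
        """Fence `failed`, then move the resource to `survivor` if fencing succeeded."""
        fenced = self.fence_device.power_off(failed)
        self.log.append(f"fence {failed}: {'ok' if fenced else 'FAILED'}")
        if not fenced:
            # Without a confirmed fence the old owner may still be writing,
            # so recovery is blocked rather than risking split-brain.
            return False
        if self.resource_owner == failed:
            self.resource_owner = survivor
            self.log.append(f"resource moved to {survivor}")
        return True


if __name__ == "__main__":
    cluster = Cluster(SimulatedFenceDevice(), resource_owner="node1")
    cluster.handle_unresponsive("node1", "node2")
    print(cluster.resource_owner)
    print(cluster.log)
```

When the simulated device is marked unreachable, handle_unresponsive returns False and leaves the resource with its original owner. This reflects the design choice that an unconfirmed fence should block failover, since the unfenced node may still be accessing the shared resource.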