RoCE

RoCE, short for RDMA over Converged Ethernet, is a networking protocol that enables Remote Direct Memory Access over Ethernet networks by carrying InfiniBand RDMA traffic directly in Ethernet frames. It allows memory-to-memory data transfers with very low latency and minimal CPU involvement, effectively offloading data movement from application and kernel software.

RoCE comes in two main variants: RoCEv1 and RoCEv2. RoCEv1 maps RDMA over Ethernet at Layer 2 and is designed for lossless Ethernet within a single Layer 2 domain. Because it assumes a lossless transport, it relies on Data Center Bridging (DCB) features such as Priority-based Flow Control (PFC) to prevent packet loss, but it is not routable across IP subnets without specialized configurations.

RoCEv2 extends RoCE by running over UDP/IP, making RDMA routable across Layer 3 networks. This enables inter-subnet connectivity and larger-scale data-center fabrics while retaining the RDMA access model. RoCEv2 still depends on a lossless link layer and DCB to maintain performance, with additional considerations for IP routing and congestion management.

Operation uses the standard RDMA verbs interface (with objects such as queue pairs and completion queues) mapped onto RoCE, enabling zero-copy transfers between hosts after memory regions are registered. Network interface cards must support RDMA and RoCE, and switches must support DCB features, to deliver the intended benefits.

Common applications include high-performance storage networks, databases, and HPC clusters, where RoCE provides low latency and high throughput with CPU offload. Limitations include the need for careful DCB configuration, sensitivity to network congestion, and potential interoperability challenges in mixed environments lacking full RoCEv2 routing.

As an alternative for non-lossless Ethernet environments, iWARP offers RDMA over TCP/IP, trading off some latency and CPU efficiency for simpler network requirements.
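The Layer 2 versus UDP/IP distinction between RoCEv1 and RoCEv2 described above can be seen in the frame headers. The following is a minimal sketch, not a production parser. It assumes untagged Ethernet frames (no 802.1Q VLAN tag) and IPv4 only, and the function name `classify_roce` is hypothetical. It relies on two well-known constants: the RoCEv1 EtherType 0x8915 and the RoCEv2 UDP destination port 4791.

```python
import struct

ROCE_V1_ETHERTYPE = 0x8915  # EtherType used by RoCEv1 frames
ROCE_V2_UDP_PORT = 4791     # UDP destination port assigned to RoCEv2
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
ETH_HEADER_LEN = 14         # dst MAC (6) + src MAC (6) + EtherType (2)


def classify_roce(frame: bytes) -> str:
    """Return 'RoCEv1', 'RoCEv2', or 'other' for a raw Ethernet frame.

    Assumption: the frame is untagged and, for RoCEv2, carries IPv4.
    VLAN-tagged or IPv6 traffic is reported as 'other' by this sketch.
    """
    if len(frame) < ETH_HEADER_LEN:
        return "other"
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ROCE_V1_ETHERTYPE:
        # RoCEv1: InfiniBand headers follow the Ethernet header directly.
        return "RoCEv1"
    if ethertype != ETHERTYPE_IPV4 or len(frame) < ETH_HEADER_LEN + 20:
        return "other"
    # IPv4 header length is the low nibble of the first byte, in 32-bit words.
    ihl = (frame[ETH_HEADER_LEN] & 0x0F) * 4
    if ihl < 20:
        return "other"
    protocol = frame[ETH_HEADER_LEN + 9]
    udp_offset = ETH_HEADER_LEN + ihl
    if protocol != IPPROTO_UDP or len(frame) < udp_offset + 8:
        return "other"
    # UDP header: source port (2 bytes), then destination port (2 bytes).
    (dst_port,) = struct.unpack_from("!H", frame, udp_offset + 2)
    return "RoCEv2" if dst_port == ROCE_V2_UDP_PORT else "other"


if __name__ == "__main__":
    # Synthetic frame: zeroed MAC addresses, RoCEv1 EtherType, dummy payload.
    sample = bytes(12) + struct.pack("!H", ROCE_V1_ETHERTYPE) + bytes(52)
    print(classify_roce(sample))
```

Because a RoCEv2 packet is an ordinary UDP/IP datagram, a router can forward it between subnets, whereas a RoCEv1 frame is identified only by its EtherType and stays within one Layer 2 domain.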