Slurm

Slurm is an open-source workload manager designed for Linux computer clusters, used to allocate resources, schedule jobs, and monitor cluster usage in high-performance computing environments. It emphasizes scalability, reliability, and a modular architecture suitable for large-scale systems.

The Slurm architecture centers on a controller and daemons: a central controller process (slurmctld) runs on the management node, while each compute node runs the slurmd daemon. An optional accounting daemon (slurmdbd) can store usage data in a relational database. Configuration is provided by the slurm.conf file, which defines partitions, nodes, and scheduling policies. Authentication typically uses MUNGE; LDAP and other methods can be integrated as needed.

Scheduling and resource management features include partitions, which group nodes for different workloads, and per-node information such as state and features. Jobs are submitted with sbatch or srun and can be organized into job arrays, have dependencies, or be restricted by quality of service (QoS) and reservations. Slurm supports backfill scheduling and, in some configurations, preemption to improve resource utilization and meet priority requirements.

Accounting and monitoring capabilities enable tracking of usage and performance. Slurm can store accounting data in a relational database via slurmdbd, with MySQL and MariaDB as the commonly used backends. User-facing and administration tools include sacct, sacctmgr, sreport, squeue, scontrol, and sprio, which provide status, accounting, and priority information as well as cluster control.

History and licensing: Slurm originated at Lawrence Livermore National Laboratory in the early 2000s as the Simple Linux Utility for Resource Management, and the name was later shortened to Slurm. It is released under the GNU General Public License and is supported by both its community and commercial vendors. Slurm is widely deployed in research and industry HPC environments and is maintained by a broad developer community.
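
As an illustration of the job submission options described above (partitions, job arrays, dependencies, and QoS), the following is a minimal Python sketch that assembles the text of a batch script for sbatch. The function name build_sbatch_script, the partition name "debug", the QoS name "normal", the job ID 12345, and the ./simulate program are all hypothetical and chosen only for the example. The #SBATCH options used (--job-name, --partition, --time, --array, --dependency, --qos) are standard sbatch options, but the values a real cluster accepts depend on its slurm.conf and accounting setup.

```python
from typing import Optional


def build_sbatch_script(
    command: str,
    job_name: str,
    partition: str,
    array: Optional[str] = None,
    after_ok: Optional[int] = None,
    qos: Optional[str] = None,
    time_limit: str = "00:10:00",
) -> str:
    """Return the text of a batch script with #SBATCH directives.

    This helper is hypothetical; it only formats text and never
    contacts a Slurm controller.
    """
    directives = [
        f"--job-name={job_name}",
        f"--partition={partition}",
        f"--time={time_limit}",
    ]
    if array is not None:
        # Job array: one task per index, e.g. "0-3" gives tasks 0..3.
        directives.append(f"--array={array}")
    if after_ok is not None:
        # Dependency: start only after the given job completes successfully.
        directives.append(f"--dependency=afterok:{after_ok}")
    if qos is not None:
        # QoS name must exist in the cluster's accounting configuration.
        directives.append(f"--qos={qos}")

    lines = ["#!/bin/bash"]
    lines += [f"#SBATCH {d}" for d in directives]
    lines += ["", command, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # All values below are assumptions made for the example.
    print(
        build_sbatch_script(
            'srun ./simulate --index "$SLURM_ARRAY_TASK_ID"',
            job_name="demo",
            partition="debug",
            array="0-3",
            after_ok=12345,
            qos="normal",
        )
    )
```

The printed text could be saved to a file and submitted with sbatch. Whether the job is accepted depends on the partitions, QoS definitions, and limits configured on the target cluster.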