Hyperparameter

A hyperparameter is a configuration value external to a machine learning model that must be set before training and is not learned from the training data by standard optimization. Hyperparameters differ from model parameters, such as weights and biases, which are learned during training. Hyperparameters influence the learning process, including the capacity of the model, the speed of convergence, and the strength of regularization.
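
To make the distinction concrete, the sketch below (plain Python with illustrative values, not drawn from any particular library) fits a one-variable linear model by gradient descent: learning_rate and n_epochs are hyperparameters fixed before training, while the weight w is a parameter learned from the data.

# Sketch: parameters vs. hyperparameters on a toy linear fit (illustrative values).

# Hyperparameters: fixed before training, not learned from the data.
learning_rate = 0.01
n_epochs = 100

# Toy data following y = 3x, so the learned weight should approach 3.0.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

# Parameter: learned from the data during training.
w = 0.0

for _ in range(n_epochs):
    # Gradient of mean squared error (w*x - y)^2 with respect to w.
    grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad  # step size is set by the learning rate

print(round(w, 3))  # ~3.0; a different learning_rate changes how fast this converges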

Examples include the learning rate used by optimization algorithms, regularization strength, the number of layers and units in a neural network, the choice of optimizer, batch size, dropout rate, and the number of trees in ensemble methods. In decision trees or boosting, the maximum depth or the learning rate acts as a hyperparameter. Some algorithms also expose controls such as early stopping criteria or data augmentation settings.
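
As one way these settings surface in practice, the following sketch assumes scikit-learn, where hyperparameters are passed as constructor arguments; the models and values chosen here are illustrative, not recommendations.

# Sketch: common hyperparameters as scikit-learn constructor arguments.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Ensemble method: number of trees and maximum depth.
forest = RandomForestClassifier(n_estimators=200, max_depth=8)

# Linear model: C is the inverse of the regularization strength.
logreg = LogisticRegression(C=0.5, max_iter=1000)

# Neural network: layer widths, initial learning rate, and batch size.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), learning_rate_init=0.001,
                    batch_size=32)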

Hyperparameter tuning aims to identify values that maximize performance on a validation dataset. Common strategies include grid search, which exhaustively tests combinations; random search, which samples combinations randomly; and Bayesian optimization, which builds a probabilistic model to guide exploration. Other approaches include Hyperband and population-based training, as well as nested cross-validation to assess tuning stability. Tuning incurs computational cost and can lead to overfitting to the validation set if performed excessively. In practice, practitioners start with reasonable defaults and use systematic search constrained by available resources.
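
A minimal grid search sketch, again assuming scikit-learn; the dataset, model, and grid values are illustrative assumptions:

# Sketch: exhaustive grid search with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=0)

# Candidate values per hyperparameter; the grid is their cross product.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [4, 8, None],
}

search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)  # best combination on the validation folds
print(search.best_score_)   # its mean cross-validated score

Swapping GridSearchCV for RandomizedSearchCV with a param_distributions dictionary gives random search under the same interface, which often scales better as the number of hyperparameters grows.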

The choice of hyperparameters can substantially affect model performance, and the specific values used should be reported to support reproducibility.
