harmonicpercussive

Harmonicpercussive, in the context of audio signal processing and music information retrieval, refers to the separation of an audio signal into two complementary components: harmonic content and percussive content. The harmonic component consists of sustained, pitched elements such as notes from melodies or harmonies, while the percussive component comprises transient, broadband events like drums or plucked string attacks. When used as a formal technique, it is often described as harmonic-percussive source separation (HPSS).

Applications of harmonic-percussive separation include music remixing, source separation for improved transcription or alignment, denoising, and feature extraction in music information retrieval.

Most HPSS methods operate on the magnitude of the short-time Fourier transform (STFT) of an audio signal. The standard approach estimates two spectrogram components: a harmonic estimate that emphasizes energy persistent over time, and a percussive estimate that emphasizes energy spread across frequencies within short time frames. The two estimates are combined with soft masks to separate the signal into harmonic and percussive spectrograms, which are then converted back to time-domain signals using the original phase information. Variants may apply median filtering along the time axis (for the harmonic estimate) or the frequency axis (for the percussive estimate) to isolate the respective components.
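
As a rough illustration of the median-filtering variant described above, the following pure-Python sketch builds soft masks from a small magnitude spectrogram. It is a sketch under stated assumptions, not a reference implementation: the function names (`median_filter_1d`, `hpss_soft_masks`, `apply_mask`), the kernel length of 5, the mask exponent of 2, and the toy spectrogram are all chosen here for illustration and are not taken from any particular library. The STFT and inverse STFT steps are omitted.

```python
from statistics import median


def median_filter_1d(values, kernel):
    """Median-filter a sequence; the window shrinks at the edges."""
    half = kernel // 2
    n = len(values)
    return [median(values[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def hpss_soft_masks(mag, kernel=5, power=2.0):
    """Return (harmonic_mask, percussive_mask) for a magnitude spectrogram.

    mag[f][t] is the magnitude at frequency bin f and time frame t.
    The kernel length and exponent are illustrative assumptions.
    """
    n_freq, n_time = len(mag), len(mag[0])

    # Harmonic estimate: median along time within each frequency bin,
    # which keeps sustained ridges and suppresses short clicks.
    harm = [median_filter_1d(row, kernel) for row in mag]

    # Percussive estimate: median along frequency within each frame,
    # which keeps broadband events and suppresses narrow tones.
    perc_by_frame = [
        median_filter_1d([mag[f][t] for f in range(n_freq)], kernel)
        for t in range(n_time)
    ]

    # Soft masks: each estimate's share of the combined energy.
    # Bins where both estimates are zero are split evenly.
    mask_h = [[0.5] * n_time for _ in range(n_freq)]
    mask_p = [[0.5] * n_time for _ in range(n_freq)]
    for f in range(n_freq):
        for t in range(n_time):
            h = harm[f][t] ** power
            p = perc_by_frame[t][f] ** power
            if h + p > 0:
                mask_h[f][t] = h / (h + p)
                mask_p[f][t] = p / (h + p)
    return mask_h, mask_p


def apply_mask(mag, mask):
    """Multiply a spectrogram by a mask, element by element."""
    return [[m * w for m, w in zip(mag_row, mask_row)]
            for mag_row, mask_row in zip(mag, mask)]


# Toy spectrogram: 9 frequency bins x 9 frames containing one sustained
# tone (horizontal ridge at bin 4) and one click (vertical ridge at frame 4).
size = 9
spec = [[1.0 if (f == 4 or t == 4) else 0.0 for t in range(size)]
        for f in range(size)]
mask_h, mask_p = hpss_soft_masks(spec, kernel=5)
harmonic = apply_mask(spec, mask_h)
percussive = apply_mask(spec, mask_p)
print("tone bin, away from click:", harmonic[4][0], percussive[4][0])
print("click frame, away from tone:", harmonic[0][4], percussive[0][4])
```

In a full pipeline the same masks would typically be applied to the complex STFT rather than to the magnitude alone, so that the original phase is carried through to the inverse transform. Where the tone and the click cross, both estimates are equally strong and the energy is split between the two outputs, a small instance of the leakage that occurs when components overlap.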

HPSS is commonly implemented in audio analysis libraries and software, with both traditional DSP-based methods and modern deep learning approaches that predict masks or directly estimate component spectra. Limitations arise when harmonic and percussive elements overlap in time and frequency, leading to leakage or artifacts. HPSS remains a useful preprocessing step but is seldom perfect for complex or dense mixes.

See also: source separation and music information retrieval.