GATs

Graph Attention Networks (GATs) are a class of neural networks designed to operate on graph-structured data by applying attention mechanisms to the neighborhood aggregation process. They assign learned, dynamic weights to neighboring nodes when computing a node's representation, rather than averaging neighbors uniformly as in some earlier graph convolutional methods. This allows the model to focus on the most relevant neighbors for each node and to adapt to neighborhoods of varying size.

In a single attention head, the model first applies a shared linear transformation to the input features of all nodes. For each edge between a node i and its neighbor j, an attention coefficient e_ij is computed using a small feed-forward network on the transformed features, typically followed by a nonlinearity such as LeakyReLU. The coefficients for node i are then normalized across its neighbors with a softmax, producing attention weights α_ij that sum to one. The updated feature for node i is a weighted sum of the transformed neighbor features, using the α_ij as weights. This process is repeated for multiple attention heads, and their outputs are either concatenated or averaged to form the next layer's representation.
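
Written out, with W denoting the shared weight matrix and a the weight vector of the attention network (the notation used in the original GAT formulation), the steps above are:

```latex
e_{ij} = \mathrm{LeakyReLU}\!\left(\mathbf{a}^{\top}\,[\mathbf{W}h_i \,\|\, \mathbf{W}h_j]\right),
\qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})},
\qquad
h_i' = \sum_{j \in \mathcal{N}(i)} \alpha_{ij}\,\mathbf{W}h_j
```

The following is a minimal PyTorch sketch of one such head. It uses a dense adjacency matrix for clarity rather than the sparse edge lists a production implementation would use; the class name GATHead and the dense formulation are illustrative, not taken from any particular library.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATHead(nn.Module):
    """One GAT attention head, as described above (dense-adjacency sketch)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared linear transformation
        self.a = nn.Linear(2 * out_dim, 1, bias=False)   # small feed-forward attention network

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) 0/1 adjacency with self-loops
        z = self.W(h)                                    # transform all nodes: (N, out_dim)
        N = z.size(0)
        # e_ij for every pair: score the concatenation [z_i || z_j]
        zi = z.unsqueeze(1).expand(N, N, -1)
        zj = z.unsqueeze(0).expand(N, N, -1)
        e = F.leaky_relu(self.a(torch.cat([zi, zj], dim=-1)).squeeze(-1),
                         negative_slope=0.2)             # (N, N)
        # restrict attention to actual neighbors, then softmax-normalize each row
        e = e.masked_fill(adj == 0, float('-inf'))
        alpha = torch.softmax(e, dim=1)                  # rows of alpha sum to one
        return alpha @ z                                 # weighted sum of transformed neighbors

# tiny usage example on random data
h = torch.randn(5, 8)
adj = (torch.rand(5, 5) > 0.5).float()
adj.fill_diagonal_(1.0)  # self-loops keep every softmax row well-defined
out = GATHead(8, 16)(h, adj)  # -> shape (5, 16)
```
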
GATs support inductive learning on unseen nodes and can scale to large graphs without requiring eigenvector computations, making them a flexible alternative to spectral methods. They have been shown to perform well on standard semi-supervised node classification tasks, such as citation networks, and have inspired numerous extensions and variants, including deeper architectures and alternative attention mechanisms.
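
As a concrete illustration of such a task, the sketch below assumes PyTorch Geometric is installed and uses its GATConv layer with its Planetoid loader for the Cora citation network. The layer sizes follow the two-layer configuration reported in the original GAT paper, but treat the exact settings here as illustrative rather than canonical.

```python
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GATConv

dataset = Planetoid(root='data/Cora', name='Cora')  # standard citation-network benchmark

class GAT(nn.Module):
    def __init__(self):
        super().__init__()
        # hidden layer: 8 heads x 8 features each, concatenated to 64 dims
        self.conv1 = GATConv(dataset.num_features, 8, heads=8, dropout=0.6)
        # output layer: a single averaged head producing per-class scores
        self.conv2 = GATConv(8 * 8, dataset.num_classes,
                             heads=1, concat=False, dropout=0.6)

    def forward(self, x, edge_index):
        x = F.elu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

data = dataset[0]
logits = GAT()(data.x, data.edge_index)  # -> (num_nodes, num_classes)
```
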
Key considerations when using GATs include computational cost on large graphs, the choice of the number of attention heads, and how to integrate multi-head outputs. While effective in many settings, their performance depends on data quality and appropriate hyperparameter tuning.
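
On the last point, multi-head integration is usually a simple stacking choice: concatenating head outputs multiplies the feature dimension by the number of heads (common in hidden layers), while averaging keeps it fixed (common in a final prediction layer). A sketch reusing the hypothetical GATHead class from above:

```python
import torch
import torch.nn as nn

class MultiHeadGAT(nn.Module):
    """Combines several GATHead modules (defined in the earlier sketch)."""

    def __init__(self, in_dim, out_dim, num_heads, concat=True):
        super().__init__()
        self.heads = nn.ModuleList([GATHead(in_dim, out_dim)
                                    for _ in range(num_heads)])
        self.concat = concat

    def forward(self, h, adj):
        outs = [head(h, adj) for head in self.heads]
        if self.concat:
            return torch.cat(outs, dim=-1)        # hidden layer: (N, num_heads * out_dim)
        return torch.stack(outs).mean(dim=0)      # output layer: (N, out_dim)
```
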