AlexNet

AlexNet is a deep convolutional neural network designed for image classification. It was proposed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012 and achieved a breakthrough on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), substantially surpassing the prior state of the art and helping to popularize deep learning for computer vision.

The architecture consists of eight learned layers: five convolutional layers followed by three fully connected layers.
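A rough way to see how the eight layers fit together is to trace feature-map sizes through the stack. The sketch below does this in Python with the filter sizes, stride, and map counts this article gives for the convolutional layers. The 227x227 input, the padding values, the 3x3 stride-2 pooling windows, and the 4096-unit fully connected widths are assumptions not stated in this article, and pooling is placed after the first, second, and fifth convolutions, following the original paper. The names `LAYERS`, `trace_shapes`, and `flattened_features` are hypothetical.

```python
# Each entry: (name, kernel, stride, padding, output maps or None for pooling).
# Kernel sizes, the stride of 4, and the map counts follow the article.
# Padding and the 3x3 / stride-2 pooling windows are assumptions.
# Local response normalization is omitted because it does not change shapes.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),  # assumed third pooling stage
]

# Assumed widths of the three fully connected layers; only the final
# 1000-way output is stated in the article.
FC_WIDTHS = [4096, 4096, 1000]


def out_size(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


def trace_shapes(input_size=227, channels=3):
    """Return (name, height, width, channels) after each layer in LAYERS."""
    shapes = []
    size = input_size
    for name, kernel, stride, pad, maps in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if maps is not None:
            channels = maps  # pooling keeps the channel count unchanged
        shapes.append((name, size, size, channels))
    return shapes


def flattened_features(shapes):
    """Number of values fed into the first fully connected layer."""
    _, height, width, channels = shapes[-1]
    return height * width * channels


if __name__ == "__main__":
    shapes = trace_shapes()
    for name, h, w, c in shapes:
        print(f"{name}: {h}x{w}x{c}")
    print("inputs to first fully connected layer:", flattened_features(shapes))
```

Under these assumptions the spatial size shrinks from 55 after the first convolution to 6 after the last pooling stage, so most of the spatial reduction happens in the strided first layer and the pooling stages rather than in the 3x3 convolutions.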

For training, AlexNet was split across two GPUs to accommodate the model size and speed up computation.
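In the original design, some convolutional layers on each GPU read only the feature maps stored on that same GPU. A minimal sketch of the effect on weight count follows, modeling the split as a grouped convolution with two groups. That modeling choice is an assumption, and the helper name `conv_weights` is hypothetical. The example reuses the 96 and 256 map counts of the first two convolutional layers and ignores biases.

```python
def conv_weights(kernel, in_maps, out_maps, groups=1):
    """Count weights (no biases) in a convolution split into `groups` parts."""
    if in_maps % groups or out_maps % groups:
        raise ValueError("maps must divide evenly across groups")
    # Each filter spans kernel x kernel x (in_maps / groups) input values.
    return kernel * kernel * (in_maps // groups) * out_maps


if __name__ == "__main__":
    full = conv_weights(5, 96, 256)             # every filter sees all 96 maps
    split = conv_weights(5, 96, 256, groups=2)  # each filter sees 48 maps
    print("single device:", full)
    print("two-way split:", split)
```

With two groups, each filter connects to half the input maps, so the layer holds half as many weights and needs no cross-GPU traffic for that layer.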

AlexNet’s success demonstrated the viability of deep CNNs for large-scale visual recognition, sparked widespread adoption of deep learning architectures, and influenced numerous subsequent networks and training techniques in the field.

The network uses the ReLU activation function, max pooling, and local response normalization. The first convolutional layer employs 11x11 filters with stride 4 and 96 feature maps; the second uses 5x5 filters with 256 maps; the remaining three convolutional layers use 3x3 filters with 384, 384, and 256 maps respectively. Local response normalization is applied after the first and second convolutional layers, and max pooling follows those two layers and the fifth convolutional layer. The final three layers are fully connected, producing 1000 class scores.

To reduce overfitting, dropout was applied to the first two fully connected layers with a rate of 0.5. Training used stochastic gradient descent with momentum on the ImageNet ILSVRC training set, which contains over a million images across 1000 classes. Data augmentation included random crops and horizontal flipping.
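The random crops and horizontal flips mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline. The image is represented as a nested list of rows, the function names are hypothetical, and the crop size and the 0.5 flip probability are illustrative assumptions.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w patch from a random position in the image."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)   # randint includes both endpoints
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def augment(image, crop_h, crop_w, rng, flip_prob=0.5):
    """Random crop followed by a horizontal flip with probability flip_prob."""
    patch = random_crop(image, crop_h, crop_w, rng)
    if rng.random() < flip_prob:
        patch = horizontal_flip(patch)
    return patch


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy 8x8 "image" whose pixel values encode their own position.
    image = [[10 * r + c for c in range(8)] for r in range(8)]
    patch = augment(image, 6, 6, rng)
    print(len(patch), "x", len(patch[0]))
```

Because each call draws a new crop position and flip decision, one source image yields many slightly different training examples, which is how this kind of augmentation helps reduce overfitting.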