InfoGAN

InfoGAN is a variant of Generative Adversarial Networks designed to learn interpretable and disentangled representations without relying on labeled data. It was introduced by Xi Chen, Yan Duan, and colleagues in 2016 as an extension to GANs that explicitly maximizes the mutual information between a subset of latent variables and the generated observations.

In InfoGAN, the latent vector is split into a noise variable z and an information-carrying code c.
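As a concrete illustration, the split can be sampled as follows. This is a minimal sketch assuming the configuration used in the paper's MNIST experiments (62 noise dimensions, one 10-way categorical code, and two continuous codes); the dimensions and the function name are illustrative, not part of any official implementation:

```python
import numpy as np

def sample_latent(batch_size, noise_dim=62, n_categories=10, n_continuous=2, rng=None):
    """Sample an InfoGAN generator input: noise z concatenated with a code c.

    c has a one-hot categorical part and continuous parts drawn from
    Uniform(-1, 1); z is incompressible Gaussian noise. Dimensions follow
    the paper's MNIST setup but are arbitrary in general.
    """
    rng = rng or np.random.default_rng()
    z = rng.standard_normal((batch_size, noise_dim))                      # noise z
    cat = np.eye(n_categories)[rng.integers(0, n_categories, batch_size)]  # one-hot categorical code
    cont = rng.uniform(-1.0, 1.0, (batch_size, n_continuous))             # continuous codes
    return np.concatenate([z, cat, cont], axis=1)                          # generator input [z, c]
```

At generation time, holding z fixed while varying one component of c is how the interpretable factors are probed.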

The code c can be discrete, continuous, or a mix of both, and is intended to capture meaningful factors of variation in the data, such as digit identity, rotation, or stroke thickness in images. A separate neural network Q is added to approximate the posterior P(c | x), enabling a tractable variational bound on the mutual information.

Training optimizes the standard GAN objective plus a term that maximizes a lower bound on I(c; G(z, c)). Specifically, the objective includes a mutual information term that is estimated using the Q network, forming a combined loss: LGAN + lambda * LInfo, where LInfo is a variational lower bound on the mutual information. This encourages the generator to produce images whose variation is controllable by c, leading to more interpretable and disentangled representations.

InfoGAN has been demonstrated on datasets such as MNIST and SVHN, showing that varying c yields predictable and interpretable changes in the generated images.
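For a single categorical code with a uniform prior, the lower bound LInfo reduces to the expected log-probability Q assigns to the sampled code, plus the (constant) prior entropy. The numpy sketch below illustrates that case; the function names are illustrative and this is a per-batch estimate, not a full training loop:

```python
import numpy as np

def info_lower_bound(c_onehot, q_logits):
    """Monte-Carlo estimate of the variational lower bound on I(c; G(z, c))
    for a categorical code: E[log Q(c | x)] + H(c).

    c_onehot: (batch, K) codes that were fed to the generator.
    q_logits: (batch, K) Q-network outputs for the resulting images.
    """
    # log-softmax of Q's logits (log Q(c | x) for each class)
    log_q = q_logits - np.log(np.exp(q_logits).sum(axis=1, keepdims=True))
    expected_log_q = (c_onehot * log_q).sum(axis=1).mean()
    h_c = np.log(c_onehot.shape[1])  # entropy of the uniform categorical prior
    return expected_log_q + h_c

def combined_generator_loss(gan_loss, c_onehot, q_logits, lam=1.0):
    """Combined objective LGAN + lambda * LInfo, written as a loss to
    minimize: subtracting the bound maximizes mutual information."""
    return gan_loss - lam * info_lower_bound(c_onehot, q_logits)
```

When Q recovers the code perfectly the bound approaches its maximum, log K; when Q is uninformative (uniform output) the bound is zero, contributing nothing to the generator's loss.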
While InfoGAN improves interpretability, the degree of disentanglement depends on the choice of c and on training dynamics, and challenges can arise in balancing the information term with the adversarial objective.