SimSiam
SimSiam, short for Simple Siamese, is a self-supervised representation learning method for images introduced in 2020. It shows that a straightforward Siamese network can learn useful visual representations without relying on negative pairs, large batch sizes, or momentum encoders.
In training, two stochastic augmentations of the same image are created and passed through a shared network
Typical architecture components include a convolutional encoder (often a ResNet), a small multilayer perceptron projection head,
Training relies on standard data augmentations (e.g., random cropping, color jitter, flipping) and commonly uses stochastic
SimSiam has influenced the design of subsequent self-supervised methods by demonstrating that strong representations can be