Weight sharing
Weight sharing is a design principle in neural networks in which the same parameters are reused across different parts of the model or across different positions in the input. The approach contrasts with using a separate set of parameters for each location, time step, or component. The primary goals are to reduce the number of learnable parameters, improve data efficiency, and promote invariances or shared structure that can generalize across locations or steps.
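The contrast can be made concrete in a short PyTorch sketch; the layer sizes and toy sequence below are illustrative assumptions rather than part of any particular model. One linear layer reused at every position of a sequence has roughly seq_len times fewer parameters than a separate layer per position.

```python
# A minimal sketch, assuming a toy sequence task: the same Linear layer is
# reused at every position, versus a separate layer per position.
import torch
import torch.nn as nn

seq_len, d_in, d_out = 10, 16, 16

# Weight sharing: one layer whose parameters are applied at every position.
shared = nn.Linear(d_in, d_out)

# No sharing: a separate set of parameters for each position.
per_position = nn.ModuleList([nn.Linear(d_in, d_out) for _ in range(seq_len)])

x = torch.randn(2, seq_len, d_in)  # (batch, position, feature)
y_shared = shared(x)               # identical weights at all positions
y_separate = torch.stack([per_position[t](x[:, t]) for t in range(seq_len)], dim=1)

def n_params(m):
    return sum(p.numel() for p in m.parameters())

print(n_params(shared), n_params(per_position))  # 272 vs. 2720 parameters
```

Both variants map the same input to an output of the same shape, but the shared layer applies one set of weights everywhere, which is what allows it to generalize across positions.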
Common forms of weight sharing include:
- Convolutional neural networks, where a single kernel is applied across all spatial positions in a feature map.
- Recurrent neural networks, where the same hidden-to-hidden and input-to-hidden weights are used at every time step.
- Tied or shared weights in autoencoders and language models, where encoder and decoder weights are tied, or the input embedding and output projection matrices share the same parameters (see the sketch after this list).
- Siamese networks, which use identical subnetworks with shared weights to compare inputs.
- Graph neural networks, where a common update function with shared parameters is applied across all nodes and edges of the graph.
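As an example of the tied-weight case above, the following PyTorch sketch ties a small language model's input embedding to its output projection, so both layers read and write a single parameter tensor. The TinyLM class, GRU backbone, vocabulary size, and model width are hypothetical choices made for illustration, not taken from the text.

```python
# A minimal sketch of weight tying between input embeddings and the output head.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size=1000, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size, bias=False)
        # Weight tying: the output projection reuses the embedding matrix,
        # so both layers share one (vocab_size x d_model) parameter tensor.
        self.out.weight = self.embed.weight

    def forward(self, tokens):               # tokens: (batch, seq)
        h, _ = self.rnn(self.embed(tokens))  # h: (batch, seq, d_model)
        return self.out(h)                   # logits: (batch, seq, vocab)

model = TinyLM()
tokens = torch.randint(0, 1000, (2, 12))
logits = model(tokens)
assert model.out.weight is model.embed.weight  # a single shared tensor
print(logits.shape)                            # torch.Size([2, 12, 1000])
```

Because the two layers point at the same tensor, gradients from both the embedding lookup and the output projection update the same parameters during training.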
Some architectures also share parameters across layers, a practice seen in models that aim to reduce size without reducing depth, for example by applying the same layer block repeatedly (a sketch follows).
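A minimal sketch of cross-layer sharing under similar assumptions (the SharedDepthNet name, the residual block, and the layer sizes are hypothetical): the same block is applied at every depth step, so adding layers adds computation but no new parameters.

```python
# A minimal sketch of cross-layer parameter sharing: one block reused at every layer.
import torch
import torch.nn as nn

class SharedDepthNet(nn.Module):
    def __init__(self, d_model=64, num_layers=6):
        super().__init__()
        # A single block; reusing it num_layers times increases depth
        # without increasing the parameter count.
        self.block = nn.Sequential(
            nn.Linear(d_model, d_model), nn.ReLU(), nn.LayerNorm(d_model)
        )
        self.num_layers = num_layers

    def forward(self, x):
        for _ in range(self.num_layers):
            x = x + self.block(x)  # residual connection around the shared block
        return x

x = torch.randn(8, 64)
print(SharedDepthNet()(x).shape)   # torch.Size([8, 64])
```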