GPT-1
GPT-1, short for Generative Pre-trained Transformer 1, is a language model developed by OpenAI and released in 2018. It introduced the idea of pre-training a single large neural network on unlabeled text and then fine-tuning it on downstream tasks, a form of transfer learning for natural language processing.
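The two training stages can be written down directly, following the notation of the original paper: pre-training maximizes a left-to-right language-modeling likelihood over an unlabeled corpus U, and fine-tuning maximizes a supervised objective over a labeled dataset C, optionally keeping the language-modeling term as an auxiliary loss.

$$
L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)
$$
$$
L_2(\mathcal{C}) = \sum_{(x,\, y)} \log P(y \mid x^1, \ldots, x^m), \qquad
L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \, L_1(\mathcal{C})
$$

Here k is the context window size, Θ denotes the transformer parameters, and λ weights the auxiliary language-modeling loss used during fine-tuning.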
Architecturally, GPT-1 uses a 12-layer transformer decoder with a hidden size of 768, 12 attention heads, and a feed-forward dimension of 3,072, totaling roughly 117 million parameters. It was pre-trained on the BookCorpus dataset with a context window of 512 byte-pair-encoded tokens.
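A minimal PyTorch sketch of a GPT-1-sized decoder-only stack is shown below. The hyperparameters mirror the published configuration, but the class, layer choices (using TransformerEncoderLayer with a causal mask to emulate a decoder-only model), and names are illustrative assumptions, not OpenAI's implementation.

```python
import torch
import torch.nn as nn

class GPT1Sketch(nn.Module):
    """Illustrative GPT-1-sized decoder-only language model (not the original code)."""

    def __init__(self, vocab_size=40478, n_ctx=512, d_model=768, n_layer=12, n_head=12):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # BPE token embeddings
        self.pos_emb = nn.Embedding(n_ctx, d_model)        # learned position embeddings
        block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_head,
            dim_feedforward=4 * d_model,   # 3,072 feed-forward units
            activation="gelu", batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=n_layer)
        # Next-token prediction head (the original ties these weights with tok_emb).
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, idx):
        # idx: (batch, seq_len) token ids
        seq_len = idx.size(1)
        pos = torch.arange(seq_len, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask so each position attends only to earlier positions (decoder-only).
        mask = nn.Transformer.generate_square_subsequent_mask(seq_len).to(idx.device)
        x = self.blocks(x, mask=mask)
        return self.lm_head(x)  # (batch, seq_len, vocab_size) next-token logits
```

For fine-tuning, the same stack is reused and the final hidden state is fed to a small task-specific linear layer, so nearly all learned parameters transfer from pre-training.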
Evaluation showed that a single pre-trained model could be adapted to a variety of NLP tasks, including natural language inference, question answering, semantic similarity, and text classification, improving the state of the art on 9 of the 12 datasets studied.
GPT-1 laid the groundwork for the GPT series and contributed to the broader adoption of large pre-trained language models in natural language processing.