GPT-3 Launches
OpenAI released GPT-3 with 175 billion parameters, demonstrating remarkable few-shot learning abilities across a wide range of tasks without task-specific fine-tuning. Users could prompt the model with just a few examples and get high-quality outputs for translation, code generation, creative writing, and more. GPT-3 ignited a wave of AI startups and applications built on large language models.
In June 2020, OpenAI published the GPT-3 paper, "Language Models are Few-Shot Learners," and began offering API access to the model, which had 175 billion parameters -- over 100 times larger than GPT-2. GPT-3 did not merely improve on its predecessor; it demonstrated qualitatively new capabilities that stunned the AI research community and captured the imagination of developers, entrepreneurs, and the general public.
The Scale
GPT-3's 175 billion parameters made it the largest language model ever trained at the time. Training required an estimated 3.14 x 10^23 floating point operations and cost several million dollars in compute. The training dataset included a filtered version of Common Crawl, WebText2, Books1, Books2, and English Wikipedia -- approximately 570 GB of text after filtering. The model's sheer size enabled capabilities that smaller models simply could not match.
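The paper's compute figure can be sanity-checked with the common rule of thumb that training a dense transformer costs roughly 6 floating point operations per parameter per training token. The sketch below is an order-of-magnitude estimate using the roughly 300 billion training tokens reported in the GPT-3 paper, not OpenAI's exact accounting:

```python
# Rough training-compute estimate for GPT-3 using the common
# ~6 FLOPs per parameter per training token approximation.
# This is an order-of-magnitude sketch, not OpenAI's exact accounting.

params = 175e9   # 175 billion parameters
tokens = 300e9   # ~300 billion training tokens, per the GPT-3 paper

flops = 6 * params * tokens
print(f"~{flops:.2e} FLOPs")  # ~3.15e+23, close to the reported 3.14 x 10^23
```

That the simple approximation lands within a percent of the published number is why it is widely used for back-of-the-envelope scaling estimates.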
Few-Shot Learning
GPT-3's most remarkable capability was few-shot learning. By providing just a few examples of a task in the prompt -- without any gradient updates or fine-tuning -- GPT-3 could perform tasks it was never explicitly trained for. Show it three examples of English-to-French translation, and it could translate a fourth. Show it examples of a particular writing style, and it could generate more. This "in-context learning" was a paradigm shift from the fine-tuning approach that had defined NLP.
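In-context learning required no special machinery on the client side: the task examples were simply concatenated into the prompt, and the model continued the pattern. A minimal sketch of how such a few-shot translation prompt might be assembled (the example pairs and the `English:`/`French:` formatting are illustrative, not prescribed by the paper):

```python
def build_few_shot_prompt(examples, query):
    """Concatenate labeled example pairs and a final unanswered
    query into a single few-shot prompt string."""
    blocks = [f"English: {en}\nFrench: {fr}" for en, fr in examples]
    blocks.append(f"English: {query}\nFrench:")
    return "\n\n".join(blocks)

examples = [
    ("cheese", "fromage"),
    ("good morning", "bonjour"),
    ("thank you", "merci"),
]
prompt = build_few_shot_prompt(examples, "sea otter")
print(prompt)
```

The prompt ends mid-pattern, after the final `French:` label, so the model's most likely continuation is the missing translation -- no gradient updates involved.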
The API and Developer Ecosystem
Unlike previous GPT models, GPT-3 was released primarily as a commercial API rather than an open-source model. This decision allowed OpenAI to control access and generate revenue, but it also democratized access to powerful AI -- any developer could integrate GPT-3 into their applications without needing massive computational resources. Within months, hundreds of applications were built on the API, from content writing tools to code generators to creative assistants.
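Integrating GPT-3 amounted to a single authenticated HTTP request. A sketch of what a request payload for the completions API might look like -- the parameter values here are illustrative examples, not recommendations, and the call itself is left as a comment since it requires an API key:

```python
import json

# Illustrative request payload for a GPT-3 completion
# (parameter values are examples, not recommendations).
payload = {
    "model": "davinci",   # the original 175B-parameter GPT-3 engine
    "prompt": "Translate English to French:\n\nEnglish: cheese\nFrench:",
    "max_tokens": 16,     # cap on the number of generated tokens
    "temperature": 0.0,   # low temperature for more deterministic output
    "stop": ["\n"],       # stop generating at the end of the line
}

# The actual request would POST this JSON to OpenAI's completions
# endpoint with an "Authorization: Bearer <API key>" header,
# e.g. via urllib.request or the requests library.
print(json.dumps(payload, indent=2))
```

This thin interface is what let hundreds of applications appear within months: all of the model's complexity sat behind a few request parameters.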
Surprising Capabilities
Developers discovered capabilities that even OpenAI had not anticipated. GPT-3 could write functioning code from natural language descriptions. It could generate legal contracts, marketing copy, and poetry. It could engage in philosophical discussions, explain complex topics, and even perform basic arithmetic (though unreliably). It could mimic specific writing styles and generate text in various formats. Each new application demonstrated the breadth of knowledge captured in the model's parameters.
Limitations
GPT-3 also had significant limitations. It could generate plausible-sounding but factually incorrect text -- a problem later termed "hallucination." It had no persistent memory between conversations. It could produce biased or offensive content. It struggled with logical reasoning, multi-step math, and tasks requiring true understanding rather than pattern matching. These limitations would drive much of the research agenda for subsequent models.
The Startup Wave
GPT-3 set off an explosion of AI startups. Companies like Jasper (content writing), Copy.ai (marketing copy), and dozens of others built businesses on top of the GPT-3 API. The model demonstrated that large language models could be the foundation for practical, valuable products, attracting billions of dollars in venture capital funding to the AI sector.
Impact on AI Research
GPT-3 provided strong evidence for the "scaling hypothesis" -- the idea that many capabilities emerge naturally as models get larger. This influenced research priorities across the entire field, with organizations racing to train ever-larger models. It also demonstrated that language models could serve as general-purpose AI tools, capable of being steered to perform diverse tasks through careful prompting rather than specialized training.
Key Figures
Tom Brown was the lead author of the GPT-3 paper, heading a large team of OpenAI researchers. OpenAI's leadership at the time included CEO Sam Altman, co-founder Greg Brockman, and chief scientist Ilya Sutskever, whose conviction in scaling shaped the project's direction.
Lasting Impact
GPT-3 proved that scaling language models could produce qualitatively new capabilities like few-shot learning, igniting an industry-wide race to build larger models. It catalyzed a wave of AI startups and established the API-based business model for commercial AI.