AI didn't appear overnight. It has a fascinating 70-year history of big dreams, long winters, and sudden breakthroughs. Understanding where AI came from helps make sense of where it's going — and why today's systems are so different from what came before.
The birth of an idea — 1940s & 50s
The story begins not with computers, but with a question. In 1950, British mathematician Alan Turing asked: "Can machines think?" He proposed what became known as the Turing Test — if a machine can converse so naturally that a human can't tell it apart from another human, it might be considered intelligent.
Then in 1956, a landmark summer workshop at Dartmouth College gave the field its name: "Artificial Intelligence". Its organizers and attendees, including John McCarthy, Marvin Minsky, and Claude Shannon, were optimistic: they believed human-level AI was perhaps 20 years away.
Alan Turing's 1950 test: if a human judge can't distinguish a machine from a human through text conversation, the machine passes. Modern LLMs such as ChatGPT and Claude can arguably pass informal versions of this test, something that seemed out of reach even 10 years ago.
A brief timeline of AI
Turing's question (1950)
Alan Turing publishes "Computing Machinery and Intelligence", proposing the Turing Test and laying the philosophical groundwork for AI.
The Dartmouth workshop (1956)
The term "Artificial Intelligence" is coined. Researchers believe human-level AI is only a couple of decades away; optimism runs high.
First AI Winter (1970s)
Progress stalls. Computers are far too slow and there's not enough data. Funding dries up, enthusiasm fades. The field enters its first "winter".
Expert systems (1980s)
AI revives with "expert systems" — programs that encode human expertise as rules. Used in medicine and finance, but brittle and expensive to maintain.
Machine learning emerges (1990s)
The focus shifts from hand-coded rules to learning from data. IBM's Deep Blue beats chess world champion Garry Kasparov in 1997 — a cultural milestone.
The deep learning revolution (2012)
A neural network called AlexNet dramatically outperforms everything else at image recognition. Deep learning — powered by GPUs and big data — changes everything.
The Transformer architecture (2017)
Google researchers publish "Attention is All You Need", introducing the Transformer — the architecture behind every modern LLM including GPT, Claude, and Gemini.
ChatGPT changes everything (2022)
OpenAI releases ChatGPT. It reaches an estimated 100 million users within two months, the fastest consumer-application adoption recorded up to that point. The generative AI era begins.
Agentic AI & multimodal (2023–present)
AI moves from answering questions to taking actions. Models can see, hear, speak, write code, and operate computers autonomously.
The concept of AI Winters
The history of AI is not a straight line upward. It has gone through several "winters" — periods when progress plateaued, funding collapsed, and the field fell out of favour. These winters happened because early promises outpaced what was technically possible.
What makes today different? Three things came together in the early 2010s: massive datasets (the internet), powerful hardware (GPUs originally built for gaming), and better algorithms (deep learning). Their convergence created the AI we know today.
Narrow AI vs General AI
Every AI system we have today is Narrow AI (also called Weak AI) — meaning it's designed for a specific task. GPT-4 is extraordinary at language, but it can't drive your car. AlphaFold predicts protein structures brilliantly, but it can't write an email.
Artificial General Intelligence (AGI) — a system that can do anything a human can do, across any domain — remains a goal that researchers debate intensely. Some believe it's decades away. Others believe it could arrive sooner. No one knows for certain.
| Narrow AI (today) | General AI (hypothetical) |
|---|---|
| Excels at one specific task | Can perform any intellectual task |
| Trained on specific data for a specific purpose | Would learn and adapt across any domain |
| Examples: ChatGPT, DALL·E, AlphaFold | No confirmed examples exist yet |
| Widely available today | Timeline uncertain — years to decades |
The three waves of AI
A useful way to think about AI's evolution is in three waves:
- Rule-based AI (1950s–1980s) — humans encode knowledge as explicit rules. Powerful in narrow domains, but brittle: such a system can't handle anything outside its rules.
- Statistical / Machine Learning AI (1990s–2010s) — instead of rules, systems learn patterns from data. Far more flexible, but requires careful feature engineering by humans.
- Deep Learning / Foundation Models (2012–present) — neural networks learn everything end-to-end from raw data. Extraordinarily powerful — capable of language, vision, reasoning, and creativity at scale.
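The difference between the first two waves can be caricatured in a few lines of Python. This is an illustrative sketch only: the toy "spam filter" task, the exclamation-mark feature, and the threshold-learning rule are all invented for this example, not drawn from any real system.

```python
# Wave 1: rule-based AI. A human encodes the decision rule by hand.
def rule_based_spam_filter(num_exclamations):
    # Hand-written rule: "more than 3 exclamation marks means spam."
    return num_exclamations > 3

# Wave 2: machine learning. The decision threshold is estimated
# from labeled examples instead of being written by a human.
def learn_threshold(examples):
    # examples: list of (num_exclamations, is_spam) pairs.
    spam = [x for x, label in examples if label]
    ham = [x for x, label in examples if not label]
    # Toy learning rule: place the threshold midway between class means.
    return (sum(spam) / len(spam) + sum(ham) / len(ham)) / 2

# Tiny invented training set.
data = [(0, False), (1, False), (6, True), (8, True)]
threshold = learn_threshold(data)  # midway between means 0.5 and 7.0

def learned_spam_filter(num_exclamations):
    return num_exclamations > threshold
```

The point of the contrast: in wave 1 a human writes the number 3 into the code, while in wave 2 the number comes from data, so the same program adapts when the data changes. Wave 3 takes this further by also learning the features themselves from raw input.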
Key takeaways
- AI as a formal field began in 1956 — it's been around for nearly 70 years
- The field has gone through cycles of hype and "winters" when progress stalled
- The modern AI era was triggered by deep learning, big data, and GPU compute converging around 2012
- All AI today is Narrow AI — designed for specific tasks, not general intelligence
- The Transformer architecture (2017) underpins every major language model today