We’re not just coding anymore — we’re prompting, orchestrating, and building alongside AI.
— Andrej Karpathy

Andrej Karpathy, former Director of AI at Tesla and founding member of OpenAI, recently gave a powerful talk titled “Software in the Era of AI.” While it’s worth watching in full (link below), here’s a distilled, structured guide to the core ideas and takeaways — especially for ML engineers building at the frontier.

🎥 Watch the full talk here → https://www.youtube.com/watch?v=LCEmiRjPEtQ

🛡 The Three Generations of Software

Karpathy introduces a new framework to understand how software is evolving:

🔹 Software 1.0 — Code as Instructions

🔹 Software 2.0 — Neural Nets as Programs

🔹 Software 3.0 — Prompts as Programs

Karpathy’s framing of Software 1.0 (manual code), 2.0 (neural networks), and 3.0 (prompting LLMs in natural language).

💡 Prompting is Programming

In the Software 3.0 world, your prompt becomes the program.

Example:
To classify sentiment, you can:

- Software 1.0: hand-write rules and string matching in explicit code
- Software 2.0: collect labeled examples and train a classifier
- Software 3.0: write a short few-shot prompt in English and let the LLM do the rest (sketched below)

This shift is not just about convenience — it’s a new computing paradigm.
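Here is a minimal sketch of the 3.0 approach, assuming the OpenAI Python SDK; the model name and prompt wording are just examples:

```python
# Software 3.0: the "program" is an English prompt, not hand-written rules or trained weights.
# Assumes the OpenAI Python SDK; the model name is only an example.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_sentiment(text: str) -> str:
    prompt = (
        "Classify the sentiment of the review as positive or negative.\n"
        "Review: 'I loved this movie.' -> positive\n"
        "Review: 'Total waste of time.' -> negative\n"
        f"Review: '{text}' ->"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{"role": "user", "content": prompt}],
        max_tokens=3,
    )
    return response.choices[0].message.content.strip()

print(classify_sentiment("The soundtrack was wonderful."))
```

Changing the behavior of this "program" means editing the prompt, not retraining a model or rewriting logic.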

💻 LLMs as Operating Systems

Karpathy argues that LLMs aren’t just tools — they’re becoming complex software platforms, like operating systems.

Similarities to an OS:

- The LLM plays the role of the CPU, and the context window acts like RAM: the working memory the model can actually attend to
- Tool use and function calling play the role of I/O and peripherals
- Apps like Cursor and Perplexity run on top of the LLM, the way software runs on Windows or Linux
- Closed providers (OpenAI, Anthropic, Google) resemble proprietary OSes, while open models like Llama resemble Linux

LLMs acting as new operating systems — orchestrating memory (context), compute (token inference), and I/O (tool use).
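To make the RAM analogy concrete, here is a toy sketch of context management: keep the system prompt plus as much recent conversation as fits a token budget. The 4-characters-per-token heuristic is a rough assumption, not a real tokenizer:

```python
# Toy illustration of "context window as working memory":
# keep the system prompt, then as much recent history as fits the budget.
def build_context(system_prompt: str, history: list[str], budget_tokens: int) -> list[str]:
    def rough_tokens(text: str) -> int:
        return max(1, len(text) // 4)  # crude heuristic: ~4 characters per token

    context = [system_prompt]
    used = rough_tokens(system_prompt)
    for message in reversed(history):    # walk from newest to oldest
        cost = rough_tokens(message)
        if used + cost > budget_tokens:
            break                        # older messages get "paged out"
        context.insert(1, message)       # keep chronological order after the system prompt
        used += cost
    return context
```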

🛠 LLM Apps Are Partially Autonomous Systems

Karpathy emphasizes that the most useful AI applications today aren’t full agents — they’re partially autonomous tools.

Example:

Cursor is an AI-powered code editor that keeps the human in the loop:

- It manages context for you (embeddings over your codebase decide what the model sees)
- It orchestrates several models under the hood (embedding, chat, and diff-apply models)
- Its GUI shows proposed diffs so you can accept or reject changes quickly
- It offers a range of modes, from tab completion to agent-style edits across the whole repo

💡 This creates an “autonomy slider”: control how much work you give the AI.
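Here is a rough sketch of what an autonomy slider can look like in code; the levels and the approval gate are illustrative, not Cursor's actual implementation:

```python
# Illustrative autonomy slider: how much the AI may change before a human must approve.
from enum import Enum

class Autonomy(Enum):
    COMPLETE_LINE = 1    # tab completion: suggest the next few tokens
    EDIT_SELECTION = 2   # rewrite only the highlighted code
    EDIT_FILE = 3        # propose a diff for the whole file
    EDIT_REPO = 4        # agent mode: multi-file changes across the repo

def apply_change(level: Autonomy, proposed_diff: str, human_approves) -> bool:
    """Gate larger changes behind explicit human review (hypothetical policy)."""
    if level in (Autonomy.COMPLETE_LINE, Autonomy.EDIT_SELECTION):
        return True                       # small, easy-to-verify edits go straight in
    return human_approves(proposed_diff)  # bigger diffs need an explicit approval
```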

LLM-native apps like Perplexity combine AI logic with familiar GUI controls to keep humans in the loop.

🧐 LLMs Are “People Spirits”

Karpathy offers a provocative analogy: LLMs are “people spirits” — stochastic simulations of humans with memory, reasoning, and personality.

LLM strengths:

- Encyclopedic knowledge and near-perfect recall of what they have read

But also weaknesses:

- Hallucinations and a "jagged" intelligence profile: superhuman at some tasks, surprisingly bad at others
- No persistent memory between sessions (a kind of anterograde amnesia)
- Gullibility: they can be steered by prompt injection and other manipulation

This makes working with LLMs a human-AI cooperation game, where:

- The AI handles generation
- The human handles verification, keeping the AI "on a leash"

Karpathy’s human-AI collaboration loop: AI generates; humans verify — and the faster this loop, the better.

🧰 How to Build Great LLM Apps

According to Karpathy, effective LLM apps share 4 common features:

1. Context management

Apps feed LLMs the right info at the right time (e.g. embeddings of your codebase); a minimal retrieval sketch appears below this list.

2. Multi-LLM orchestration

Use different models for different jobs (chat, retrieval, diffs).

3. Custom GUI for audit & control

A good interface lets users see what the AI is doing and approve/reject outputs quickly.

4. Autonomy slider

Let users control how much power the AI gets — from autocomplete to repo-wide edits.

Cursor allows developers to slide between manual coding and full AI-driven repo changes — a spectrum of autonomy.
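As a concrete reading of points 1 and 2, here is a minimal retrieval sketch: embed code chunks, rank them by cosine similarity to the question, and pass only the top matches into the prompt. The embedding model name is an example, and the pipeline is a simplification of what real tools do:

```python
# Minimal "context management" sketch: retrieve the most relevant code chunks
# and feed only those into the prompt. Embedding model name is an example.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

def top_k_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    chunk_vecs = embed(chunks)
    query_vec = embed([question])[0]
    # cosine similarity between the question and every chunk
    scores = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    best = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in best]
```

The retrieved chunks would then be prepended to the chat prompt, keeping the context window focused on what matters for the question.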

🧠 Design for Speed: Generation + Verification

Karpathy says: we’re no longer just writing software — we’re verifying AI-generated software.

How to speed up the feedback loop:

- Make verification visual: GUIs and rendered diffs let humans audit far faster than reading raw text
- Keep the AI on a leash: ask for small, incremental changes you can actually review, not a 10,000-line diff
- Be concrete in prompts: precise instructions raise the odds that verification succeeds on the first pass
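A bare-bones version of that generate-verify loop might look like this; the console prompt stands in for the GUI a real app would use, and the model name is an example:

```python
# Bare-bones generate -> verify loop: the model proposes, the human disposes.
from openai import OpenAI

client = OpenAI()

def generate_patch(instruction: str, code: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{
            "role": "user",
            "content": f"Rewrite this code so that: {instruction}\n\n{code}",
        }],
    )
    return response.choices[0].message.content

def generate_and_verify(instruction: str, code: str, max_rounds: int = 3) -> str | None:
    for _ in range(max_rounds):
        patch = generate_patch(instruction, code)
        print(patch)  # a real app would render this as a diff in a GUI
        if input("Accept this change? [y/n] ").lower() == "y":
            return patch  # verified: ship it
        instruction += " (previous attempt was rejected; try a different approach)"
    return None  # human gave up; fall back to manual edits
```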

🌍 Build for LLMs, Not Just Humans

A surprising insight: LLMs are now users of software, alongside humans (through GUIs) and programs (through APIs).

What this means:

- Add an llms.txt file (analogous to robots.txt) that tells an LLM, in plain markdown, what your site or product is and how to use it
- Publish docs in markdown, which LLMs parse far more reliably than styled HTML
- Replace "click here" instructions with commands (e.g. curl calls) an agent can actually execute

Designing for agents: Simplified, machine-readable documentation helps LLMs understand and interact with your software.
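A hypothetical llms.txt entry (all names and URLs made up for illustration) might look like:

```
# Acme Analytics
> Hosted dashboards with a REST API for usage data.

## Docs
- [Quickstart](https://example.com/docs/quickstart.md): create an API key and send your first request
- [API reference](https://example.com/docs/api.md): endpoints, parameters, error codes

## Notes
- Every docs page is also available as raw markdown by appending .md to its URL
```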

✨ Vibe Coding: Everyone’s a Programmer Now

A viral moment from the talk: "vibe coding," the term Karpathy coined for building software by feel, guided by an LLM.

You don’t know Swift? Doesn’t matter.
Prompt the LLM, copy-paste, tweak, repeat.

He built working iOS and web apps without knowing the languages, just by “vibing” with the LLM.

This changes who can build software — and how fast they can do it.

Karpathy’s Menu.app — built by ‘vibe coding’ an AI prototype without knowing Swift. The future of accessible dev.

🚧 DevOps is Now the Bottleneck

Ironically, the hardest part isn't the coding; it's all the non-code setup:

- Authentication and user accounts
- Payments and API keys
- Deployment, domains, and DNS
- Clicking through web dashboards and following long lists of instructions

These tasks are still GUI-based and require human clicks. Karpathy asks:

“Why am I doing this? Let the agents do it!”

Early attempts to address this include the Model Context Protocol (MCP), which gives agents a standard way to connect to tools and data sources instead of relying on human clicks.
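As a sketch of what that can look like, here is a minimal MCP tool server using the official Python SDK's FastMCP helper; the deploy tool itself is hypothetical, and the exact SDK surface should be checked against the MCP docs:

```python
# Minimal MCP server sketch: exposes one "deploy" tool an agent could call
# instead of a human clicking through a dashboard. Assumes the official
# `mcp` Python SDK (FastMCP helper); the tool body is hypothetical.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("devops-helper")

@mcp.tool()
def deploy_service(service: str, environment: str) -> str:
    """Deploy a named service to an environment (hypothetical example)."""
    # A real server would call your CI/CD or cloud provider API here.
    return f"Triggered deploy of {service} to {environment}"

if __name__ == "__main__":
    mcp.run()  # serves the tool over MCP (stdio transport by default)
```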

The rise of agents

📜 Takeaways for ML Engineers

✅ Learn to work with prompts, not just code
✅ Develop effective apps by combining GUI + autonomy sliders to keep AI on a leash
✅ Structure apps around fast generate–verify loops
✅ Build documentation and UIs that speak to agents