We’re not just coding anymore — we’re prompting, orchestrating, and building alongside AI.
— Andrej Karpathy
Andrej Karpathy, former Director of AI at Tesla and founding member of OpenAI, recently gave a powerful talk titled “Software in the Era of AI.” While it’s worth watching in full (link below), here’s a distilled, structured guide to the core ideas and takeaways — especially for ML engineers building at the frontier.
🖥 LLMs Are the New Operating System
Karpathy frames LLMs as a new kind of operating system: they orchestrate memory (the context window), compute (token inference), and I/O (tool use).
Open-source models (LLaMA, Mistral) play the role of Linux in this analogy.
LLM-native apps like Cursor or Perplexity run on top of this OS layer.
🛠 LLM Apps Are Partially Autonomous Systems
Karpathy emphasizes that the most useful AI applications today aren’t full agents — they’re partially autonomous tools.
Example:
Cursor is an AI-powered code editor:
You can type manually (human control).
Or you can highlight code and let the AI rewrite it.
Or let it modify an entire repo (full autonomy).
💡 This creates an “autonomy slider”: control how much work you give the AI.
LLM-native apps like Perplexity combine AI logic with familiar GUI controls to keep humans in the loop.
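The "autonomy slider" idea can be made concrete with a small sketch. This is a hypothetical illustration (the level names and `allowed_actions` helper are invented here, not Cursor's actual API): each notch on the slider unlocks one more class of AI action.

```python
from enum import Enum

class AutonomyLevel(Enum):
    """Hypothetical notches on an 'autonomy slider'."""
    MANUAL = 0          # human types everything
    SUGGEST = 1         # AI proposes autocompletions
    EDIT_SELECTION = 2  # AI rewrites highlighted code
    EDIT_REPO = 3       # AI may modify the whole repository

def allowed_actions(level: AutonomyLevel) -> list[str]:
    """Return the actions the AI may take at a given autonomy level."""
    actions = ["autocomplete", "rewrite_selection", "edit_repo"]
    # Each level unlocks one more action; MANUAL unlocks none.
    return actions[:level.value]

# At EDIT_SELECTION the AI may autocomplete and rewrite a selection,
# but repo-wide edits still require the human to move the slider.
print(allowed_actions(AutonomyLevel.EDIT_SELECTION))
```

The point of the gate is that the human, not the model, decides how much power the AI gets at any moment.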
🧐 LLMs Are “People Spirits”
Karpathy offers a provocative analogy: LLMs are “people spirits” — stochastic simulations of humans with memory, reasoning, and personality.
LLM strengths:
Huge general knowledge
Superhuman pattern recognition
But also weaknesses:
Hallucinations
No persistent memory
Easily manipulated (prompt injections)
This makes working with LLMs a human-AI cooperation game, where:
AI generates
Human verifies
Karpathy’s human-AI collaboration loop: AI generates; humans verify — and the faster this loop, the better.
🧰 How to Build Great LLM Apps
According to Karpathy, effective LLM apps share 4 common features:
1. Context management
Apps feed LLMs the right info at the right time (e.g. embeddings of your codebase).
2. Multi-LLM orchestration
Use different models for different jobs (chat, retrieval, diffs).
3. Custom GUI for audit & control
A good interface lets users see what the AI is doing and approve/reject outputs quickly.
4. Autonomy slider
Let users control how much power the AI gets — from autocomplete to repo-wide edits.
Cursor allows developers to slide between manual coding and full AI-driven repo changes — a spectrum of autonomy.
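Feature 1 (context management) can be sketched in a few lines. Real apps use embedding models over the codebase; as a self-contained stand-in, this hypothetical `retrieve_context` ranks files by bag-of-words cosine similarity instead, but the shape of the idea is the same: fetch only the most relevant context before prompting the model.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    # Bag-of-words stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve_context(query: str, files: dict[str, str], k: int = 2) -> list[str]:
    """Rank codebase files by similarity to the query and return the
    top-k names: 'the right info at the right time'."""
    q = vectorize(query)
    ranked = sorted(files, key=lambda name: cosine(q, vectorize(files[name])),
                    reverse=True)
    return ranked[:k]

# Toy codebase summaries (hypothetical file names and contents).
files = {
    "auth.py": "login password hash check user session",
    "billing.py": "charge card payment invoice stripe",
    "ui.py": "render button settings page layout",
}
print(retrieve_context("fix the password login bug", files, k=1))
```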
🧠 Design for Speed: Generation + Verification
Karpathy says: we’re no longer just writing software — we’re verifying AI-generated software.
How to speed up the feedback loop:
Use visual GUIs to inspect results (faster than reading raw text).
Write clear, constrained prompts to reduce failures.
Avoid mega-diffs; think in small chunks.
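The "clear, constrained prompts" and "small chunks" tips can be combined in one place. This is a hypothetical helper (the wording and limits are invented here, not from the talk) that wraps a task in explicit scope and output constraints so the resulting diff stays small enough to verify quickly.

```python
def constrained_prompt(task: str, target_file: str, max_lines: int = 30) -> str:
    """Wrap a task in explicit constraints so failures are rarer and
    the resulting diff stays small and easy for a human to verify."""
    return (
        f"Task: {task}\n"
        f"Scope: only modify {target_file}; do not touch other files.\n"
        f"Output: a unified diff of at most {max_lines} changed lines.\n"
        "If the change cannot fit, reply CANNOT_FIT instead of a diff.\n"
    )

print(constrained_prompt("rename foo to bar", "utils.py"))
```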
🌍 Build for LLMs, Not Just Humans
A surprising insight: LLMs are now users of software. Just like humans or APIs.
What this means:
Write docs in LLM-readable formats (e.g. markdown, JSON).
Avoid instructions like “click here”; replace them with API calls or shell commands.
Add an llms.txt file to help LLMs understand your site’s purpose.
Designing for agents: Simplified, machine-readable documentation helps LLMs understand and interact with your software.
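As an illustration, a file following the emerging llms.txt convention might look like this. The site and contents below are entirely hypothetical; the idea is simply a short, plain-markdown summary an agent can read before navigating your docs.

```markdown
# ExampleCo Docs

> ExampleCo is a (hypothetical) billing API. This file summarizes the
> site for LLM agents in plain markdown.

## Key pages
- /docs/quickstart: create an API key and make a first charge
- /docs/api: full REST reference with JSON request/response examples

## Notes for agents
- All endpoints accept and return JSON; no GUI interaction is needed.
```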
✨ Vibe Coding: Everyone’s a Programmer Now
A viral moment from the talk: “vibe coding,” the term Karpathy coined.
You don’t know Swift? Doesn’t matter. Prompt the LLM, copy-paste, tweak, repeat.
He built working iOS and web apps without knowing the languages, just by “vibing” with the LLM.
This changes who can build software — and how fast they can do it.
Karpathy’s Menu.app — built by ‘vibe coding’ an AI prototype without knowing Swift. The future of accessible dev.
⚙️ DevOps Is Now the Bottleneck
Ironically, the hardest part isn’t coding — it’s all the non-code setup:
Auth
Hosting
Billing
Deployment
These tasks are still GUI-based and require human clicks. Karpathy asks:
“Why am I doing this? Let the agents do it!”
One early attempt at addressing this is the Model Context Protocol (MCP), which gives agents a standard way to discover and call tools instead of clicking through GUIs.
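The core idea behind MCP can be sketched without the protocol itself. The snippet below is a deliberately simplified, hypothetical illustration (the real protocol uses JSON-RPC transport, resources, and more): expose DevOps actions as named tools with machine-readable schemas, so an agent can discover and invoke them instead of a human clicking through a dashboard.

```python
import json

# Hypothetical tool registry: names, schemas, and implementations.
TOOLS = {
    "create_dns_record": {
        "description": "Point a domain at a host",
        "params": {"domain": "str", "ip": "str"},
        "fn": lambda domain, ip: f"{domain} -> {ip}",
    },
    "deploy": {
        "description": "Deploy the current build to an environment",
        "params": {"env": "str"},
        "fn": lambda env: f"deployed to {env}",
    },
}

def list_tools() -> str:
    """What an agent reads first: tool names, descriptions, schemas."""
    return json.dumps(
        {name: {"description": t["description"], "params": t["params"]}
         for name, t in TOOLS.items()}
    )

def call_tool(name: str, **kwargs) -> str:
    """Dispatch a tool call requested by the agent."""
    return TOOLS[name]["fn"](**kwargs)

print(call_tool("deploy", env="staging"))
```

Once auth, hosting, and deployment are reachable this way, “why am I doing this?” has an answer: the agent does it.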
The rise of agents
📜 Takeaways for ML Engineers
✅ Learn to work with prompts, not just code
✅ Develop effective apps by combining GUI + autonomy sliders to keep AI on a leash
✅ Structure apps around fast generate–verify loops
✅ Build documentation and UIs that speak to agents