
Andrej Karpathy invented the term in February 2025. Twelve months later, he declared it obsolete — and named its replacement himself.


In February 2025, Andrej Karpathy named a pattern developers were already living — and called it vibe coding. The original post described it simply: describe what you want, let the AI build it, paste errors back into the chat until it works. He called it a “shower thoughts throwaway tweet.”

The term exploded. Collins Dictionary named it Word of the Year for 2025.

By February 2026, Karpathy was back. Same platform, different message: “Today, programming via LLM agents is increasingly becoming a default workflow for professionals, except with more oversight and scrutiny.” Vibe coding, he said, was passé. He had a new name: agentic engineering.

This is not a rebranding exercise. The shift is structural — and if you are shipping production code with AI tools, it affects how you work right now.


What Vibe Coding Actually Was

Karpathy’s original definition was precise: “fully give in to the vibes, embrace exponentials, and forget that the code even exists.”

That last part is the key. Vibe coding was never about code quality. It was about removing the blank page. You described the outcome, the AI produced something, you ran it, you iterated. No design phase, no architecture review, no ownership of what the model generated.

For prototypes, this was genuinely useful. About a quarter of the startups in Y Combinator’s Winter 2025 cohort reported codebases that were roughly 95% AI-generated. Lovable reached a $6.6 billion valuation building tools that let non-technical founders ship working apps in a weekend. The floor for what one person could build moved dramatically.

The approach itself was never the problem. The problem was treating a prototyping workflow as an engineering methodology.


Where It Broke

In January 2026, a startup called Moltbook launched. The founder wrote zero lines of code — the entire product was AI-generated. Three days after launch, Wiz security researcher Gal Nagli found the production database fully exposed: 1.5 million API authentication tokens, 35,000 email addresses, private messages. The root cause was a single missing configuration — Row Level Security never enabled in Supabase. A basic code review would have caught it in minutes. There was no code review. The developer had never read the code.

This is not an edge case. It is a predictable outcome of delegating code ownership entirely to a model. The same pattern surfaces in infrastructure decisions — database configuration, hosting choices, security defaults — wherever speed replaces scrutiny.

The data backs it. CodeRabbit’s December 2025 analysis of 470 real-world open source pull requests found that AI-generated code contains 1.7x more critical and major defects than human-written code. Security vulnerabilities run 1.5 to 2.74x higher. Performance inefficiencies appear nearly 8x more often. Logic and correctness issues — business logic errors, misconfigurations, unsafe control flow — are 75% more common.

These are not random failures. They follow a pattern.

Vibe coding also introduces what practitioners call context collapse. In a long session of prompt-and-fix cycles, the model’s context window fills with failed iterations, conflicting instructions, and chat history that has nothing to do with the current problem. The model loses track of its own logic. You end up fixing one bug that creates two others — not because the AI is failing, but because the process gives it no stable foundation to work from.

Karpathy put it plainly at Sequoia’s AI Ascent 2026: vibe-coded output tends to be “bloaty, a lot of copy-paste, awkward abstractions that are brittle.” Because the developer never read the code, they have no model of why it breaks.


What Replaced It

Karpathy’s definition of agentic engineering is exact: “the professional discipline of coordinating fallible agents while preserving correctness, security, taste, and maintainability.”

The word fallible is doing most of the work in that sentence. Agentic engineering does not assume the AI gets it right. It builds systems that catch and contain the ways it gets it wrong.

At Sequoia, Karpathy walked through what that actually looks like. The agentic engineer designs specs before touching the AI, supervises plans, inspects diffs, writes tests, creates evaluation loops, manages permissions, isolates worktrees. The role shifts from writing code to designing the conditions under which an agent can write code safely.
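One way to picture the "evaluation loop" item from that list is a bounded retry wrapped around deterministic checks. The sketch below is an illustration under stated assumptions, not Karpathy's implementation: `run_agent` is a hypothetical stand-in for whatever agent call you use, each check is a hypothetical function that returns an error message or `None`, and `fake_agent` and `no_todo` exist only so the example runs without a real model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A check inspects the agent's output and returns an error message,
# or None when the output passes. (Hypothetical interface.)
Check = Callable[[str], Optional[str]]


@dataclass
class Result:
    accepted: bool
    output: str
    attempts: int
    failures: List[str]


def evaluate(run_agent: Callable[[str], str], spec: str,
             checks: List[Check], max_attempts: int = 3) -> Result:
    """Call the agent until every check passes or the attempts run out."""
    failures: List[str] = []
    output = ""
    for attempt in range(1, max_attempts + 1):
        # Each attempt gets a fresh prompt: the spec plus only the latest
        # failures, never the full history of earlier tries.
        prompt = spec
        if failures:
            prompt += "\n\nFix these failures:\n" + "\n".join(failures)
        output = run_agent(prompt)

        failures = []
        for check in checks:
            message = check(output)
            if message is not None:
                failures.append(message)
        if not failures:
            return Result(True, output, attempt, [])
    # Out of attempts: hand the problem back to a human instead of looping.
    return Result(False, output, max_attempts, failures)


# Hypothetical stand-ins so the sketch runs without a real agent.
def no_todo(output: str) -> Optional[str]:
    return "output still contains TODO" if "TODO" in output else None


def fake_agent(prompt: str) -> str:
    if "Fix these failures" in prompt:
        return "def reset(): return True"
    return "TODO: implement"


result = evaluate(fake_agent, "Implement the password reset endpoint.", [no_todo])
```

The cap on attempts is the design choice that matters here: when the checks keep failing, the loop stops and returns the failures to a person rather than letting the session accumulate.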

Anthropic’s 2026 Agentic Coding Trends Report makes the same point from the team side: most teams experimenting with agentic AI skip the oversight infrastructure. That is where they stall.


Vibe Coding vs. Agentic Engineering

The difference is not which AI tool you use. It is where you retain control.

|  | Vibe coding | Agentic engineering |
| --- | --- | --- |
| Starting point | A prompt | A written spec with explicit constraints |
| Code ownership | Delegated entirely to the AI | Retained by the human |
| Debugging | Paste the error back into chat | Read logs, trace agent reasoning, understand the execution path |
| Context management | One long, accumulating session | Isolated tasks, scoped context windows |
| Quality gate | Does it run? | Does it run correctly, securely, and without breaking in three weeks? |
| Best for | Prototypes, throwaway scripts, solo exploration | Production code, complex state, anything you will maintain |

The mindset difference is sharper than any of these rows. Vibe coding says: write this feature for me. Agentic engineering says: execute this scoped spec, within these constraints, and I will review the output as rigorously as any human’s pull request.

One treats the AI as an author. The other treats it as a contractor who works fast, makes architectural decisions you will inherit, and gets things wrong in ways that won’t surface until production.


What To Do Differently

Write the spec before you open the chat. Not a vague description but a structured document: what the input is, what the output is, what the edge cases are, and what the constraints are. The better the instructions, the better the agent’s output tends to be.
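A structured spec can be as small as a data class that refuses to render until every section is filled in. This is a minimal sketch, assuming a spec has the four parts named above plus a goal. The `TaskSpec` class and its field names are invented for illustration and are not a standard format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskSpec:
    """Hypothetical spec structure: goal, input, output, edge cases, constraints."""
    goal: str
    inputs: str
    outputs: str
    edge_cases: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are still empty."""
        missing = []
        for name in ("goal", "inputs", "outputs", "edge_cases", "constraints"):
            value = getattr(self, name)
            if isinstance(value, str):
                value = value.strip()
            if not value:
                missing.append(name)
        return missing

    def render(self) -> str:
        """Render the spec as plain text, or fail loudly if it is incomplete."""
        missing = self.missing_sections()
        if missing:
            raise ValueError("spec incomplete, missing: " + ", ".join(missing))
        lines = ["GOAL: " + self.goal,
                 "INPUT: " + self.inputs,
                 "OUTPUT: " + self.outputs,
                 "EDGE CASES:"]
        lines += ["- " + case for case in self.edge_cases]
        lines.append("CONSTRAINTS:")
        lines += ["- " + rule for rule in self.constraints]
        return "\n".join(lines)


spec = TaskSpec(
    goal="Implement the password reset endpoint",
    inputs="POST body with an email address",
    outputs="HTTP 202 whether or not the account exists",
    edge_cases=["unknown email", "expired token"],
    constraints=["touch only the reset handler", "no new dependencies"],
)
prompt_text = spec.render()
```

The point of raising on a missing section is that an incomplete spec fails before the agent ever sees it.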

Scope every task to a single concern. Asking an agent to “build the authentication flow” is a vibe coding prompt. Asking it to “implement the password reset endpoint according to this spec, touching only these files, using this error handling pattern” is agentic engineering. Small, isolated tasks keep context windows clean and make output reviewable.
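The "touching only these files" part of such a prompt can be checked mechanically. The sketch below assumes the agent's change is available as a git-style unified diff and reads only the `diff --git` header lines. The file paths are made up for the example, and paths containing the sequence ` b/` would need a real diff parser.

```python
import re
from typing import Iterable, List, Set

# Matches the header git writes before each file's changes.
_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")


def files_touched(diff_text: str) -> Set[str]:
    """Collect every path named in the 'diff --git' headers of a diff."""
    touched: Set[str] = set()
    for line in diff_text.splitlines():
        match = _HEADER.match(line)
        if match:
            # A rename lists two different paths; both count as touched.
            touched.update(match.groups())
    return touched


def out_of_scope(diff_text: str, allowed: Iterable[str]) -> List[str]:
    """Return touched paths that the task spec did not allow."""
    return sorted(files_touched(diff_text) - set(allowed))


# Hypothetical diff: the agent was asked to change only the reset handler.
example_diff = """\
diff --git a/src/auth/reset.py b/src/auth/reset.py
--- a/src/auth/reset.py
+++ b/src/auth/reset.py
@@ -1 +1,2 @@
+def reset(): ...
diff --git a/src/db/config.py b/src/db/config.py
--- a/src/db/config.py
+++ b/src/db/config.py
@@ -1 +1 @@
-RLS = True
+RLS = False
"""

violations = out_of_scope(example_diff, ["src/auth/reset.py"])
```

A non-empty `violations` list is a reason to reject the change before anyone spends time reviewing it.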

Read the code. Every line. You do not have to write it, but you must understand it. If you cannot explain why the agent made a specific architectural choice, you do not own the codebase — you are renting it from the model until the next session resets its memory.

Add deterministic guardrails. Linters, formatters, type checkers, automated tests — run them on every agent output before it merges. These are checks the agent cannot bypass and cannot hallucinate past. They are the difference between catching a missing security configuration before launch and discovering it three days after.
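A guardrail of this kind can be very small. As an illustration tied to the missing Row Level Security setting described earlier, the sketch below scans migration SQL for tables that are created but never get an `ALTER TABLE ... ENABLE ROW LEVEL SECURITY` statement. It is a text scan under loud assumptions, not a SQL parser: it assumes migrations are plain SQL text, it will be fooled by commented-out statements, and the table names are hypothetical.

```python
import re
from typing import List

_CREATE = re.compile(
    r'create\s+table\s+(?:if\s+not\s+exists\s+)?([\w."]+)',
    re.IGNORECASE,
)
_ENABLE = re.compile(
    r'alter\s+table\s+(?:if\s+exists\s+)?(?:only\s+)?([\w."]+)'
    r'\s+enable\s+row\s+level\s+security',
    re.IGNORECASE,
)


def _normalize(name: str) -> str:
    # Drop quotes and a leading "public." so both spellings compare equal.
    name = name.replace('"', "").lower()
    if name.startswith("public."):
        name = name[len("public."):]
    return name


def tables_without_rls(sql: str) -> List[str]:
    """Return tables that are created but never have RLS enabled."""
    created = {_normalize(name) for name in _CREATE.findall(sql)}
    enabled = {_normalize(name) for name in _ENABLE.findall(sql)}
    return sorted(created - enabled)


# Hypothetical migration: one table is protected, one is not.
migration = """
create table public.users (id uuid primary key);
alter table public.users enable row level security;
create table messages (id uuid primary key, body text);
"""

unprotected = tables_without_rls(migration)
```

Wired into CI so that a non-empty result fails the build, a check like this runs whether or not anyone remembered to read the migration.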

Review agent output as you would a junior engineer’s pull request. Not rubber-stamped because it runs. Scrutinized because it represents a decision about your system’s architecture and security — one you will be debugging at 2am six months from now.

What not to do: Do not use a vibe coding workflow for anything that touches production data, handles authentication, processes payments, or will be maintained by anyone other than you for longer than a week. The short-term speed is not worth the cleanup cost.


The Line That Matters

At Sequoia AI Ascent 2026, Karpathy’s host opened with this: “Last year, he coined ‘vibe coding’. This year, he’s never felt more behind as a programmer. The big shift: vibe coding raised the floor. Agentic engineering raises the ceiling.”

Karpathy built the floor. He is now working on the ceiling — at Anthropic, on the pretraining team, from May 2026.

Use the floor for prototypes, exploration, throwaway scripts. It gets you from zero to something you can evaluate. That is what it was always for.

Production code lives at the ceiling. Getting there means taking back ownership — of the spec, the review, the architecture, the decision. Your name is on the codebase. What runs in production is your responsibility, regardless of who wrote it.