
Why Andrej Karpathy Called AI 'Slop'—And What It Really Means for Builders

Prajwal Paudyal, PhD
SprintUX.ai Team

A leading AI researcher's controversial claims seemed to pop the hype bubble. But a closer look reveals a practical roadmap for building valuable AI systems today.

Published: 2025-11-15

Summary

A recent podcast interview with Andrej Karpathy, a founding research scientist at OpenAI, sent ripples through the tech world. His assessment that truly useful AI agents are a 'decade away' and his description of their current state as 'slop' were quickly framed as a burst of the AI hype bubble. This article goes beyond the headlines to unpack Karpathy's nuanced critique. He argues that today's agents suffer from fundamental deficits in memory, robustness, and reliability, and that the methods used to train them, like reinforcement learning, are deeply inefficient. However, his analysis is not a declaration of failure but a sober-minded assessment from the research frontier. For builders and practitioners, Karpathy's points highlight a crucial takeaway: the immense value of today's AI is unlocked not by waiting for magical, out-of-the-box intelligence, but by applying disciplined engineering and robust architecture to compensate for its current, predictable limitations.

Key Takeaways

  • Andrej Karpathy's claim that useful agents are a 'decade away' refers to fully autonomous agents that don't require extensive architectural support.
  • Current AI agents lack durable memory, robustness, and reliability, which Karpathy calls 'slop.'
  • Builders can create immense value today by architecting systems that manage memory and ensure reliability for the agent.
  • Karpathy critiques current training methods like reinforcement learning for their inefficient, 'blunt' feedback signals, calling for richer, more granular supervision.
  • He projects that AI will follow a path of 'continuity over rupture,' blending into long-term economic trends rather than causing a sudden, dramatic spike in GDP growth.
  • The analogy of self-driving cars illustrates AI's real-world challenges with 'edge cases,' requiring painstaking, localized training.
  • A core, under-discussed problem underlying many of AI's limitations is the challenge of creating durable, effective memory for LLMs.
  • Karpathy argues we should think of AI as a 'controllable tool,' not a creature we are trying to evolve, steering clear of biological metaphors.

Article

The Conversation That Rocked Silicon Valley

For a few days, it was the only thing anyone in tech seemed to be talking about. In a wide-ranging podcast interview, Andrej Karpathy—a founding research scientist at OpenAI and one of the most respected minds in the field—delivered a dose of cold water on the feverish hype around artificial intelligence.

Headlines seized on his most provocative phrases. Current AI agents are 'slop.' Truly useful, autonomous agents are a 'decade away.' The media framed it as an insider popping the AI bubble, a rebuttal to the relentless optimism machine. The reaction was so strong that Karpathy later clarified on X (formerly Twitter) that he “did not intend to ‘pop the bubble’ or anything,” but was simply trying to have a grounded conversation about building AI systems.

Now that the dust has settled, it’s clear the controversy stemmed from a collision of two different frames of reference: the view from the cutting-edge research frontier and the view from the trenches of practical application. Karpathy was speaking from the former, outlining the fundamental, decade-long challenges that researchers face. But for builders, engineers, and businesses working with AI today, his critique isn't a stop sign. It’s a roadmap—one that highlights exactly where we need to focus our efforts to build remarkable things with the tools we already have.

Karpathy's critique highlights the gap between AI's raw capabilities and the polished, reliable systems required for real-world application.

Deconstructing the Core Critiques

To understand the takeaways, we first have to understand Karpathy's specific points, which go far beyond a single soundbite.

Claim 1: Useful Agents Are a Decade Away (The 'Slop' Problem)

This was the headline-grabber. When Karpathy says truly 'useful' agents are a decade out, he’s not saying they can't perform tasks today. He’s defining 'useful' from a researcher's perspective: an agent that works reliably out of the box, without extensive hand-holding. He argues that current agents fundamentally lack three things:

  1. Memory: They don't inherently learn from interactions or remember past context in a durable way.
  2. Robustness: They can be brittle and fail when faced with unexpected inputs or 'edge cases.'
  3. Reliability: Their performance can be inconsistent, making them difficult to trust for mission-critical, autonomous operations.

This is the 'slop' he refers to—the gap between a flashy demo and a production-ready system that works every time. From his vantage point, closing that gap requires fundamental breakthroughs, not just incremental improvements.

Claim 2: LLMs Have 'Cognitive Deficits'

Karpathy pointed to the way we train Large Language Models (LLMs) as a core source of their limitations. During pre-training, a model learns by predicting the next word in a massive dataset. The feedback is incredibly simple: was the prediction right or wrong?

He memorably described this as trying to suck supervision 'bits through a straw.' The learning signal is sparse and lacks nuance. It’s a difficult and inefficient way to learn complex reasoning. This leads to what he calls 'cognitive deficits'—the model can mimic intelligent behavior but lacks a deeper, more flexible understanding.
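The narrowness of that signal can be illustrated with a toy sketch. This is not how any real training stack is written, just a minimal picture of the point: for each position, the model distributes probability over a vocabulary, but the supervision collapses to a single number measuring how much probability landed on the one "correct" next token.

```python
import math

# Tiny illustrative vocabulary; real models use tens of thousands of tokens.
vocab = {"the": 0, "cat": 1, "sat": 2, "mat": 3}

def next_token_loss(predicted_probs, target_token):
    """Cross-entropy for one position: all the supervision the model
    receives is 'how much probability did you assign to the right token?'"""
    return -math.log(predicted_probs[vocab[target_token]])

# The model spreads probability across the whole vocabulary...
probs = [0.1, 0.2, 0.6, 0.1]

# ...but the learning signal only inspects the target's slot: -ln(0.6).
loss = next_token_loss(probs, "sat")
print(round(loss, 3))  # → 0.511
```

Everything else the model might have gotten right or wrong about the sentence is invisible to this signal, which is the sense in which the supervision arrives 'through a straw.'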

This critique extends to reinforcement learning (RL), a technique used to fine-tune models. He described it as a 'blunt instrument' because the feedback (a simple positive or negative reward) is often applied to a long sequence of actions, making it hard to assign credit or blame to any single step. While he called the method 'terrible,' he also admitted he couldn't think of a better alternative right now.

Claim 3: AGI Won't Cause an Economic Miracle

Perhaps most controversially for the techno-optimists, Karpathy pushed back on the idea that Artificial General Intelligence (AGI) will create a 'step function' in economic growth. His base case is that AI’s impact will blend into the long-term trend of automation, contributing to the roughly 2% annual GDP growth we’ve seen for decades, rather than causing a sudden explosion to 8% or more.


He points to history for evidence. The arrival of the personal computer and the internet in the 1990s profoundly changed society, yet this transformation never appeared as a dramatic spike in productivity statistics. This phenomenon is so well-known it has a name: the Solow Productivity Paradox. Similarly, the mobile and social web revolutions didn't radically alter the baseline growth curve.

Karpathy’s view is one of continuity, suggesting AI is the next chapter in a long story of technological integration, not a complete rupture from the past. This stands in contrast to more dramatic predictions, including from executives at labs like Anthropic, who have suggested AI could bring about very rapid shifts in the economy and labor market.

The Builder’s Reality: Finding Value Amidst the Flaws

If you’re building with AI, it’s easy to hear Karpathy’s critique and feel discouraged. But that’s the wrong takeaway. His points are valid, but they describe the raw material, not what can be built with it.

Architecture is the Bridge to Value

The irony is that Karpathy is right: agents, on their own, are sloppy. But that doesn't prevent them from creating enormous value. Companies are already reporting massive efficiency gains by deploying AI systems today. Klarna, for instance, reported its AI assistant handles the work of 700 full-time agents, managing two-thirds of all customer service chats.

The key is that these successful deployments don't just plug in a raw LLM and hope for the best. They treat the agent’s weaknesses as engineering problems to be solved with smart architecture.

This is the discipline of memory engineering. If an agent lacks memory, you build a system to provide it. You design databases, state machines, and retrieval mechanisms that give the agent the context it needs, when it needs it. If an agent isn't reliable, you build validation layers, human-in-the-loop workflows, and fallback logic. You architect for the agent you have, not the agent you wish you had.
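The pattern above can be sketched in a few lines. This is a minimal, hypothetical shape for such a wrapper, not a real framework: every name here (`memory_store`, `call_model`, `validate`, `fallback`) is a placeholder you would supply yourself.

```python
def run_with_guardrails(query, memory_store, call_model, validate,
                        fallback, max_retries=2):
    """Wrap an unreliable model call with retrieved context, output
    validation, retries, and a fallback path. All collaborators are
    caller-supplied placeholders, not a real library's API."""
    # 1. Memory engineering: the agent doesn't remember, so we retrieve
    #    relevant context from our own store and inject it into the prompt.
    context = memory_store.get(query, "")
    prompt = f"Context: {context}\n\nTask: {query}"

    # 2. Reliability engineering: validate the output; retry on failure.
    for _ in range(max_retries + 1):
        answer = call_model(prompt)
        if validate(answer):
            # 3. Write back: durable memory is the system's job,
            #    not the model's.
            memory_store[query] = answer
            return answer

    # 4. Fallback: escalate to a human or return a safe default.
    return fallback(query)
```

In practice `validate` might be a schema check or a second-model critique, and `fallback` a human-in-the-loop queue; the point is that every weakness Karpathy names maps to an explicit layer in the architecture.
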


The Self-Driving Parallel: Progress is Incremental, Not Instant

Karpathy uses self-driving cars as a prime example of AI's real-world difficulties. A system like Waymo can’t just learn to drive in one city and then instantly deploy everywhere. It has to painstakingly map and learn the unique challenges of every new location—the tricky intersections, the local driving habits, the unexpected construction. Each city is a universe of edge cases.

This highlights the brittleness he’s talking about. Yet, Waymo is not waiting. It is operational in cities like Phoenix and San Francisco and is actively expanding to Los Angeles, Austin, and beyond. It is solving the problem piece by piece, city by city.

This is a powerful metaphor for building with AI agents. We don't need to solve the grand problem of general intelligence to create value. We can bite off specific, well-defined problems and build systems that solve them reliably, one use case at a time.

Four Deeper Takeaways Everyone Missed

Beyond the main controversies, Karpathy’s conversation offered several profound points that got lost in the noise.

1. Embrace Continuity Over Rupture

His gradualist economic forecast is a call for a more disciplined approach to planning. Instead of betting on magical, overnight transformations (or doomsday scenarios), assume a future of steady, compounding improvements. The boring, foundational work you do today to ensure reliability and structure will remain relevant. This mindset encourages building sustainable systems rather than chasing speculative leaps.

2. The Real Critique of Reinforcement Learning

Karpathy isn't 'anti-RL.' He is critiquing the current implementation of it, which relies on sparse, trajectory-level signals. His call to action is for researchers to develop methods for finer-grained supervision. Imagine being able to give an AI feedback not just on an entire essay ('this is bad'), but on a specific sentence or word choice ('this verb is weak'). This is a call to make RL richer and more precise, not to abandon it.
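The essay analogy can be made concrete with a toy contrast between outcome-level and step-level scoring. This is a conceptual sketch only; the grading functions are hypothetical stand-ins for whatever reward model would do the judging.

```python
def outcome_reward(steps, grade_output):
    """Trajectory-level supervision: one scalar for the whole output,
    smeared uniformly over every step."""
    r = grade_output(steps)
    return [r] * len(steps)

def process_reward(steps, grade_step):
    """Step-level supervision: each intermediate step is judged on its
    own, so credit and blame land where they belong."""
    return [grade_step(s) for s in steps]

essay = ["strong thesis", "clear evidence", "weak verb choice", "solid close"]

def grade(sentence):
    # Hypothetical per-sentence judge: penalize only the flawed sentence.
    return -1.0 if "weak" in sentence else 1.0

# Outcome supervision: one flaw sinks every sentence's credit.
print(outcome_reward(essay, lambda ss: min(grade(s) for s in ss)))
# → [-1.0, -1.0, -1.0, -1.0]

# Process supervision: only the weak sentence is penalized.
print(process_reward(essay, grade))
# → [1.0, 1.0, -1.0, 1.0]
```

The first signal is the 'blunt instrument'; the second is the richer, more granular supervision Karpathy is asking researchers to make tractable.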

3. Memory is the Master Problem

Coming back to a central theme, Karpathy argues that an agent cannot learn like a human if it cannot remember like a human. He positions the lack of durable memory as a root cause of many other issues, from learning inefficiency to a lack of robustness. This suggests that breakthroughs in how LLMs store, retrieve, and update knowledge could unlock a cascade of other capabilities. For builders, it reinforces that focusing on 'memory engineering' is one of the highest-leverage activities right now.

Karpathy identifies durable memory as a foundational challenge; solving it could unlock a cascade of advancements in AI learning and reliability.

4. We're Building Tools, Not Creatures

In a fascinating aside, Karpathy pushed back against the popular analogy of AI development as a form of evolution, comparing LLM training to the compression of knowledge in DNA. He argued this metaphor is misleading and potentially harmful.

His point is crucial: we are not trying to build artificial animals or replicate biological evolution. We are trying to build useful and controllable tools. Biological metaphors can lead us to optimize for the wrong things, like emergent autonomy, instead of what we actually need: reliability, predictability, and safety.

Why It Matters: A Roadmap, Not a Rebuke

Andrej Karpathy’s analysis isn't pessimistic; it's realistic. He provided a clear-eyed view from the frontier, outlining the deep scientific challenges that lie ahead. But for those of us applying AI today, his critique is a gift.

It tells us where the pitfalls are. It validates the hard, often unglamorous work of building robust architecture. It encourages a focus on practical, incremental progress over waiting for a mythical AGI to arrive and solve everything.

It is, in fact, the decade of agents. Not because the problem is solved, but because it is so clearly defined. We have powerful, if flawed, new capabilities in our hands, and a massive runway to build systems that compensate for those flaws and deliver incredible value along the way.

Citations

  • Post by Andrej Karpathy on X - X (formerly Twitter) (org, 2024-05-29) https://x.com/karpathy/status/1795859938922143999
  • Primary source where Karpathy clarifies his intent behind the podcast comments, stating he did not mean to 'pop the bubble.'
  • The productivity paradox of information technology - National Bureau of Economic Research (NBER) (whitepaper, 2005-08-01) https://www.nber.org/papers/w11523
  • A foundational paper by Susanto, Lee, and Lee discussing the Solow Paradox, which provides academic context for Karpathy's point about technology not always showing up in GDP statistics.
  • Evidence - The House of Lords Communications and Digital Committee - UK Parliament (gov, 2023-11-21) https://committees.parliament.uk/oralevidence/13800/pdf/
  • Testimony from Dario Amodei, CEO of Anthropic, discussing the potential for 'dramatic' economic benefits and labor market shifts from AI, providing a counterpoint to Karpathy's gradualist view.
  • Klarna AI assistant handles two-thirds of customer service chats in its first month - Klarna (org, 2024-02-27) https://www.klarna.com/international/press/klarna-ai-assistant-handles-two-thirds-of-customer-service-chats-in-its-first-month/
  • Provides a concrete, high-profile example of an AI agent delivering massive, quantifiable business value, supporting the argument that today's agents are useful despite their limitations.
  • Where Waymo is driving - Waymo (org, 2025-11-15) https://waymo.com/waymo-driver/
  • Official source listing Waymo's current and expanding operational cities, verifying the self-driving car analogy and its incremental rollout strategy.
  • Illustrating Reinforcement Learning from Human Feedback (RLHF) - Hugging Face (documentation, 2022-12-12) https://huggingface.co/blog/rlhf
  • Provides a clear, accessible explanation of the RLHF process, which helps readers understand the mechanics behind Karpathy's critique of 'sparse' and 'blunt' reward signals.
  • Useful AI agents are ~a decade away - Andrej Karpathy | The Dwarkesh Podcast - Dwarkesh Patel (news, 2024-05-24) https://www.youtube.com/watch?v=Q_t-s12bQ_I
  • The original source material for the entire discussion, containing Karpathy's full, unedited comments on AI agents, RL, and economic growth.
  • The state of AI in 2023: Generative AI’s breakout year - McKinsey & Company (whitepaper, 2023-08-01) https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year
  • Provides broad industry context on AI adoption and reported ROI, showing that while generative AI is new, companies are already seeing cost savings and value, which supports the article's thesis.
  • Andrej Karpathy, OpenAI founding member, is leaving the company again - The Verge (news, 2024-02-13) https://www.theverge.com/2024/2/13/24072229/andrej-karpathy-openai-founding-member-is-leaving-the-company-again
  • Establishes Karpathy's credentials and role as a 'founding member' of OpenAI, lending weight to his perspective.
  • Beyond the productivity paradox - Science (journal, 2017-11-17) https://www.science.org/doi/10.1126/science.aao4310
  • An academic article by economist Erik Brynjolfsson discussing the modern productivity paradox, explaining why AI's benefits might be substantial but slow to appear in official statistics due to mismeasurement and implementation lags.

Appendices

Glossary

  • AI Agent: An AI system designed to perceive its environment and take autonomous actions to achieve specific goals. Unlike a simple chatbot, an agent can have memory, planning capabilities, and the ability to use tools.
  • Pre-training: The initial, computationally intensive phase of training a large language model. The model learns grammar, facts, and reasoning abilities by processing vast amounts of text data, typically by learning to predict the next word in a sentence.
  • Reinforcement Learning (RL): A machine learning training method where an AI agent learns to make a sequence of decisions by receiving rewards or penalties for its actions. In LLMs, this is often used to align the model's outputs with human preferences (RLHF).
  • Productivity Paradox: The apparent contradiction between dramatic advances in technology and the slow growth of productivity as measured in national economic statistics. It suggests a lag between innovation and its measurable economic impact.

Contrarian Views

  • Some AI leaders and futurists maintain that AGI could arrive much sooner than a decade and will likely cause a 'step function' change in economic growth, arguing that its ability to automate cognitive work is fundamentally different from past technologies.
  • The rapid, compounding progress in model capabilities (e.g., the jump from GPT-3 to GPT-4) suggests to some that the 'cognitive deficits' Karpathy mentions could be overcome by scale and architectural innovations faster than he predicts.
  • There is a significant camp of researchers who believe that emergent, creature-like properties are not only unavoidable but potentially desirable for achieving true general intelligence, directly opposing Karpathy's 'tool-making' philosophy.

Limitations

  • This analysis is based on a single, long-form interview. Karpathy's views may be more nuanced than can be captured in one conversation.
  • The field of AI is evolving at an unprecedented rate. Assessments of timelines and capabilities can become outdated within months, not years.
  • The distinction between a 'researcher's perspective' and a 'builder's perspective' is a useful heuristic but is also a simplification; many individuals and teams operate in both modes.

Further Reading

  • The Dwarkesh Podcast: Useful AI agents are ~a decade away - Andrej Karpathy - https://www.youtube.com/watch?v=Q_t-s12bQ_I
  • Artificial Intelligence and the Modern Productivity Paradox: A Survey of the Microeconomic Evidence (NBER) - https://www.nber.org/papers/w25316
  • Attention Is All You Need (The original Transformer paper) - https://arxiv.org/abs/1706.03762

Recommended Resources

  • Signal and Intent: A publication that decodes the timeless human intent behind today's technological signal.
  • Thesis Strategies: Strategic research excellence — delivering consulting-grade qualitative synthesis for M&A and due diligence at AI speed.
  • Blue Lens Research: AI-powered patient research platform for healthcare, ensuring compliance and deep, actionable insights.
  • Outcomes Atlas: Your Atlas to Outcomes — mapping impact and gathering beneficiary feedback for nonprofits to scale without adding staff.
  • Qualz.ai: Transforming qualitative research with an AI co-pilot designed to streamline data collection and analysis.

Ready to accelerate your customer research?

Get insights in 24 hours, not 24 days. Your first 5 interviews are free.