This essay is telescopic. It can shrink or expand, depending on how much attention you are willing to give.

Cargo Cult Intelligence

The first principle is that you must not fool yourself — and you are the easiest person to fool.
- Richard Feynman

There is a chasm that runs through the heart of all human endeavor, a deep and often invisible fault line separating what is from what seems. On one side lies capability, the quiet power to effect change in the world. On the other, performance, the dazzling spectacle of appearing to do so. True intelligence is an inhabitant of the first territory; it is the stubborn, adaptive capacity to achieve goals within the messy, unpredictable currents of a living world. Its counterfeit, a more theatrical and seductive creature, simply masters the gestures. It goes through the motions that might be right, or might be wrong, but it does so with a chilling indifference to the difference, for it knows none.

We recognize this ghost from our own human theater. We have all been in the presence of the impressive-sounding businessperson, the one whose tongue is slick with the latest jargon, whose slides are polished to a mirror sheen, and whose confidence is as unshakeable as it is unearned. They glide from meeting to meeting on a thermal of platitudes, masters of the appearance of work, perpetually shielded by a cloud of strategic ambiguity and the velocity of their own job-hopping from the gravitational pull of real-world consequences. Their careers are a monument to the principle that in many human systems, the simulation of competence can be more rewarding than competence itself.

Now, this ancient human drama is being reenacted, at planetary scale, within the silicon minds of our machines. The latest generation of Large Language Models has achieved a terrifying fluency in the language of human intelligence. They can generate coherent legal documents, plausible corporate strategies, and articulate philosophical arguments with an eerie verisimilitude. They are the ultimate sophists, capable of arguing any side of any issue with the same detached, algorithmic poise. The problem, our problem, is that telling the genuine article from the exquisite fake has become harder than ever. A broken clock, as the old saying goes, is right twice a day. These new clocks are right far more often, and with such convincing authority that we are beginning to forget to ask if they even know what time it is.

The Seduction of the Plausible

The pragmatic question, of course, is how many times a day a clock needs to be right in order to be useful. In certain domains, where the cost of error is low and the landscape of problems is familiar, a statistically plausible answer is often good enough. A broken clock is, after all, more useful than no clock at all if you only need a rough sense of morning versus afternoon. But this pragmatic tolerance is a trap. Most of us, especially when the stakes are high in our careers, our health, our collective future, are not looking for a "good enough" clock. We want the damn thing to be right. And if that is our aim, we cannot approach the problem pragmatically. We must approach it with the ruthless intellectual honesty of a scientist.

Perspective: Cognitive Science: "The human brain is a masterpiece of cognitive efficiency, which is a polite way of saying it's a master of cutting corners. We are wired to prefer fluency over friction. A statement that is easy to process, that fits our existing mental models, and that is delivered with confidence is processed as being more truthful, regardless of its actual validity. LLMs are fluency engines. They generate text that is maximally easy for our brains to swallow. They are, in effect, a mirror held up to our own cognitive biases, and we find the reflection intoxicatingly persuasive."

This vulnerability is not a modern affliction. The Sophists of ancient Greece built an entire educational and political system on the power of persuasive, plausible speech, often divorced from any commitment to truth. Plato spent his entire philosophical career fighting this shadow, arguing for a form of knowledge grounded in reality, not just rhetorical effect. We are now mass-producing digital Sophists, and their seductive power is orders of magnitude greater. They do not get tired, they have read everything, and they can tailor their rhetoric to our individual psychological profiles with inhuman precision. We love to be fooled, and we have just invented the perfect fooler.

Feynman's Ghost in the Machine

The problem is not new, and its most poignant articulation comes not from a computer scientist, but from the physicist Richard Feynman. In a now-legendary 1974 commencement address, he spoke of "cargo cult science." He was describing the practices of certain South Pacific islanders who, after witnessing the arrival of military cargo planes during World War II, sought to summon them back. They observed the form of the operation with meticulous care and reconstructed it from the materials at hand. They carved headphones from wood and wore them while sitting in makeshift control towers built of bamboo. They lit signal fires along runways cleared in the jungle. They performed the rituals, mimicked the choreography, and held the posture of a functioning airfield. But the planes did not land.

Feynman’s islanders were not stupid. They were keen observers, empiricists of a kind. They had simply mistaken the visible rituals for the underlying cause. They could not see the vast, invisible network of logistics, manufacturing, communication, and geopolitical strategy that made the cargo arrive. They had the form, but not the substance.

For a long time, our AI systems were unabashedly cargo cults. They were expert systems and symbol manipulators that could produce outputs that looked like intelligence, but were brittle and easily broken. Today, however, the line has blurred. The most advanced AI systems are capable of both modes of being. Depending on how they are architected, the context they are given, and the feedback they receive, they can exhibit either the hollow mimicry of the cargo cult or the first glimmers of genuine, grounded capability. The difference lies in whether we are content with the performance on the runway, or whether we insist on the arrival of actual cargo.

Imagine you ask a powerful, unassisted LLM to write a marketing plan for your new startup. It will oblige with an impressive document. It will be perfectly structured, with sections on SWOT analysis, target demographics, channel strategies, and KPIs. It will use all the right words. It will look and sound exactly like a marketing plan. It may even contain, by sheer statistical chance, a few genuinely useful ideas adrift in a sea of generic platitudes. But it will be a cargo cult artifact. It is an imitation of a plan, a spectral echo of thousands of other marketing plans it has ingested, utterly disconnected from the living reality of your customers, your brand, your market, and your goals. It is a runway made of straw.

The Anatomy of Grounding

Now, imagine a different approach. You take that same powerful language model, but you refuse to let it operate in a vacuum. You embed it in an "agentic harness," a framework that forces it into contact with the world. You build an epistemic membrane between the model and reality, and you give it the tools to manage the flow of information across that membrane.

Instead of just asking it to write a plan, you task it with developing one, and you grant it the senses to do so:

  • You give it access to the nervous system of your business: the real-time and historical data on product performance, sales figures, and user engagement. It doesn't just guess at what works; it sees what has worked.
  • You pipe in the voice of the customer: a direct feed of support tickets, reviews, social media mentions, and research interviews. The AI doesn't invent a "target demographic"; it listens to the cacophony of real human needs, frustrations, and desires.
  • You grant it eyes on the outside world: the ability to analyze competitors' strategies, to track media narratives, to scrape data on market trends and cultural shifts. It understands the ecosystem in which the plan must survive.
  • You open the toolbox of your organization: access to creative assets, brand guidelines, and repositories of proven marketing tactics. It learns the specific language and constraints of your context.
  • You force it into a crucible of internal validation. You build a "digital red team," an adversarial setup where one agentic process generates a strategy, another acts as a synthetic but critical audience, and a third judges the interaction and refines the approach. The plan is pressure-tested in simulation before it ever touches the real world.
  • You tether its proposals to consequences. The AI doesn't just generate KPIs; it checks its own proposals against your business's actual benchmarks and goals, optimizing for tangible outcomes, not just textual coherence.
  • And finally, most crucially, you give it the ability to touch reality, however tentatively. You empower it to design and execute small-scale A/B tests, to run pilot campaigns, to dip its toe in the water and adjust its grand strategy based on the feedback it receives.

A system architected in this way is no longer just a plausible text generator. It is a genuine cognitive partner. Its output is not a generic imitation, but a bespoke, adaptive, and grounded strategy. The intelligence is no longer confined to the static weights of the core model; it emerges from the dynamic interplay between the model, its tools, and its environment. It has shifted from performance to capability. It has learned the difference between building a runway and landing a plane.
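
To make the shape of such a harness concrete, here is a minimal sketch of the generate-critique-judge loop described above. Everything in it is a hypothetical placeholder: the function names, the grounding fields, and the scoring threshold are illustrative assumptions, not a prescription for any particular framework or model API.

```python
# A minimal sketch of an "agentic harness" for grounded planning.
# All functions and data sources are hypothetical placeholders; a real system
# would wire them to an actual model API, telemetry, and experiment tooling.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Grounding:
    """The epistemic membrane: everything the model is allowed to 'sense'."""
    product_metrics: dict      # e.g. sales, engagement, retention figures
    customer_voice: list[str]  # support tickets, reviews, interview notes
    market_context: list[str]  # competitor moves, trends, media narratives
    brand_assets: list[str]    # guidelines, constraints, proven tactics


def grounded_planning_loop(
    generate: Callable[[Grounding, list[str]], str],   # drafts a plan from grounding + past critiques
    critique: Callable[[str, Grounding], str],         # plays the synthetic, skeptical audience
    judge: Callable[[str, str], float],                 # scores the exchange against business goals
    grounding: Grounding,
    max_rounds: int = 5,
    good_enough: float = 0.8,
) -> str:
    """Generate -> red-team -> judge -> refine, until the plan survives scrutiny."""
    critiques: list[str] = []
    best_plan, best_score = "", float("-inf")

    for _ in range(max_rounds):
        plan = generate(grounding, critiques)    # conditioned on real data, not vibes
        objection = critique(plan, grounding)    # adversarial pressure-test in simulation
        score = judge(plan, objection)           # tethered to benchmarks, not textual polish

        if score > best_score:
            best_plan, best_score = plan, score
        if score >= good_enough:
            break                                # survived the internal crucible
        critiques.append(objection)              # feed the punch back into the next draft

    return best_plan
```

In a real deployment, `generate`, `critique`, and `judge` would each wrap a model call or a deterministic evaluator, and a plan that survives this internal loop would still graduate to the small-scale pilots described above before anything of consequence is staked on it.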

Perspective: Craftsmanship: "You can't learn to work with wood by reading books. You can learn the names of the tools, the theory of the joints. But you don't know the wood until you feel it resist the plane, until you see how the grain splits, until you spend hours sanding it smooth. The wood teaches you. The mistakes teach you. A machine that has never made a mistake with real material doesn't know the material. It only knows the idea of it."

This principle of grounding is universal. It is the difference between a medical AI that recites textbook diagnoses and one that cross-references a patient's unique genomic data and real-time vitals against millions of case histories. It is the difference between a legal AI that drafts a generic contract and one that analyzes decades of case law relevant to a specific jurisdiction and commercial context. The intelligence is not in the model, but in the membrane. In fact, one might argue that the very concept of intelligence is meaningless in a vacuum. If there is no world, and there is no goal, how can any action be deemed intelligent?

The Uncomfortable Necessity of Failure

This brings us to the core of the matter, to the visceral, uncomfortable truth encapsulated in Mike Tyson's famous dictum: "Everybody has a plan until they get punched in the mouth." The punch in the mouth is reality's unforgiving feedback loop. It is the moment where the elegant abstraction of the plan collides with the messy, resistant friction of the world. Cargo cult intelligence is a system designed to endlessly generate plans. Real intelligence is a system designed to survive, and learn from, the punch.

Perspective: Philosophy of Science: "Karl Popper argued that the defining characteristic of a scientific theory is not that it is verifiable, but that it is falsifiable. A theory that cannot, in principle, be proven wrong is not science; it is dogma. The same is true of intelligence. An 'intelligent' system that is architected in such a way that it can never be meaningfully wrong is not intelligent. It is a belief system. True learning requires the possibility of error—the 'punch in the mouth' is the empirical test that allows the system to falsify its own bad hypotheses."

As builders and users of these systems, we are deeply allergic to this idea. We are driven by a desire for perfection, for control, for the seamless and the predictable. We build systems to avoid failure, not to embrace it. The venture capitalist wants to see a flawless demo. The customer wants a product that just works. The engineer is trained to eliminate bugs. But in doing so, we risk creating systems that are exquisitely polished and profoundly stupid. We build our straw airplanes with ever-greater fidelity, marveling at their craftsmanship, while the sky remains stubbornly empty.

If your AI system cannot be meaningfully wrong within its own context, how can it ever be meaningfully right? Its correctness is merely an accident of statistical correlation, a lucky guess. Genuine correctness, genuine understanding, is forged in the crucible of error. It is the hard-won prize of having tried a thousand wrong paths to find the one that works. Evolution's staggering creativity is not born of a grand design, but of relentless, random mutation and catastrophic failure. The vast majority of mutations are useless or harmful. But the tiny fraction that are not have given us the entire biosphere.

The challenge, then, is not to simply marvel at the coherence of AI-generated text—and a marvel it is. The challenge is to build systems that connect that coherence to consequences. The most radical proponents of reinforcement learning, like Rich Sutton, argue that true, general intelligence will not emerge from the current paradigm of pre-trained models. It will only arise when systems learn from the ground up, in real-time, from direct interaction with reality.

I am not sure the prescription needs to be so absolute. I believe we can get meaningfully and pragmatically intelligent systems from our current LLM-based architectures. But it requires a fundamental shift in our role. It is on us, the human architects and partners, to provide the interface to reality. It is on us to design the feedback loops. It is on us to have the courage to let our systems get punched.
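
Read as an engineering requirement, Popper's point becomes simple: a proposal only counts if it ships with a prediction that reality can refute. The sketch below is illustrative only; `run_small_experiment`, the lift numbers, and the tolerance are assumptions standing in for whatever pilot, A/B test, or sensor actually touches the world.

```python
# A minimal sketch of the "punch in the mouth": every proposal carries a
# falsifiable prediction, and a real (if small) experiment gets to veto it.

from typing import Callable


def reality_check(
    proposal: str,
    predicted_lift: float,                          # the falsifiable claim, e.g. +0.12 conversion lift
    run_small_experiment: Callable[[str], float],   # returns the measured lift from a real pilot
    tolerance: float = 0.5,
) -> tuple[bool, float]:
    """Run the pilot and report whether the prediction survived contact."""
    measured = run_small_experiment(proposal)       # reality reports back, however awkwardly
    if predicted_lift == 0:
        return measured > 0, measured
    error = abs(measured - predicted_lift)
    survived = error <= tolerance * abs(predicted_lift)
    return survived, measured
```

The specific arithmetic matters far less than the contract it enforces: no proposal is accepted on coherence alone, and a falsified prediction is fed back into the next round rather than quietly forgotten.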

The Human as Sensor and Connoisseur

Perhaps this reveals the next stage of our co-evolution with our tools. If the AI provides the boundless computational and generative power, the human provides the grounding. We are not just the prompters, the goal-setters, the users. We become the primary sensors for the system, the providers of the rich, qualitative, high-context feedback that can't be easily scraped from a database. We are the connoisseurs of reality.

This is a far cry from the dystopian vision of humans as low-wage clickworkers providing reinforcement learning from human feedback (RLHF) to corporate AI behemoths. That is the cargo cult version of the human's role—a ritualistic, low-bandwidth imitation of genuine feedback. The role we must embrace is that of a creative midwife, a partner in a dance of interdependence. We provide the gut feeling, the aesthetic judgment, the ethical intuition, the nuanced understanding of a social situation—the very things that are, for now, beyond the reach of the algorithm. We are not just training the AI; we are thinking with it, in a hybrid cognitive loop where each partner brings its unique strengths.

Perspective: Art and Music: "In a jazz ensemble, the 'intelligence' of the music doesn't reside in any single player. It emerges from the listening. It's a 'call and response' between the musicians, a shared exploration of a harmonic space. Each player offers an idea, and the others respond, building on it, challenging it, re-contextualizing it. The human-AI relationship at its best should feel like this. The AI offers a thousand possibilities, and the human, with their taste and intuition, responds, guiding the shared creation toward something that has soul."

This vision requires a new kind of literacy. The most important skill in this new era is not prompt engineering. It is the continuous, disciplined practice of learning how not to fool ourselves. It is the cultivation of a taste for the real, an allergy to the merely plausible. It is the courage to ask of any impressive output: "But did the cargo land?"

The immediate future of intelligence, both human and artificial, depends on our ability to tell the difference between the mock runway and the real thing. For the practical AI builders of today, the question is stark: Are we building systems to help us land real cargo, or are we just getting better at carving airplanes out of straw?

The only way to find out is to embrace the punch. We must allow our systems to fail, to be wrong, to be awkward. We must build robust, high-bandwidth channels for reality to report back. We must reward our systems not for the beauty of their plans, but for their resilience in the face of contact.

Are you ready to let your LLM be punched in the mouth? Or would you rather keep building your immaculate straw airplanes, hoping that the venture capital cargo will continue to fall from the sky? The choice is a test not of our technological prowess, but of our intellectual courage.


Originally published: October 3, 2025