UNTOLD · Mind · NO. M01

The Man Who Wore the Juice

A famous psychology finding has been misread for two decades — the truth is quieter and more uncomfortable.

On the afternoon of January 6, 1995, a heavyset man named McArthur Wheeler walked into the Mellon Bank in Pittsburgh, pointed a gun at the teller, and walked out with cash. An hour later, he did it again at a second branch a few miles away. He wore no mask. He made no attempt to dodge the security cameras. He smiled, in fact, at one of them.

The footage played on the local eleven o’clock news. By midnight, tipsters had named him. By morning, police were at his door. When officers showed Wheeler the surveillance still, he stared at it with the expression of a man whose physics had just betrayed him.

“But I wore the juice,” he told them.1

Wheeler, it emerged, had soaked his face in lemon juice before each robbery. He had heard somewhere — perhaps a misremembered chemistry-class anecdote about invisible ink — that lemon juice could render skin invisible to camera film. He had even tested the theory in his kitchen, snapping a Polaroid of himself. The photo had come back blank, which he took as proof. The likelier explanation, investigators concluded, was that he had pointed the camera at the ceiling.

The story is funny in the way that all confident misunderstandings are funny, and it would have stayed a local curio if a Cornell psychologist named David Dunning had not read about it in the 1996 World Almanac. Dunning was at that moment circling a question he could not quite articulate: why did some people seem to be so spectacularly certain about things they clearly did not understand? Wheeler offered him a beginning. He photocopied the article, handed it to a graduate student named Justin Kruger, and said something to the effect of: there’s a pattern here.

What followed was an experiment, a paper, an internet meme, a misreading, and — eventually — a slow scientific correction that almost nobody outside of academic psychology has noticed. The Dunning-Kruger Effect, as the world now uses the phrase, is not what David Dunning and Justin Kruger found. The thing they did find is stranger, smaller, and more useful.

Unskilled and Unaware

In 1999, Dunning and Kruger published their results in the Journal of Personality and Social Psychology under the title “Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments.”2 The paper described four experiments. Cornell undergraduates were tested on humor (rating jokes against a panel of professional comedians), logical reasoning (an LSAT subsection), and English grammar. After each test, students were asked to estimate, in percentile terms, how well they had done relative to their peers.

The pattern that emerged was consistent and a little melancholy. The students who scored in the bottom quartile — the genuinely poor performers — estimated themselves to be in roughly the 62nd percentile. People who answered fewer than one in five logic questions correctly believed they had outperformed three-fifths of the room. Top performers, meanwhile, did something almost equally interesting in the opposite direction: they slightly underestimated themselves, apparently assuming that questions they had found easy must have been easy for everyone.

Dunning would later describe the central finding as a “double curse.” The cognitive equipment required to perform a skill well is, by and large, the same equipment required to evaluate performance of that skill. If you cannot tell a well-constructed argument from a sloppy one, you cannot tell whether your own argument is well-constructed. If you have no ear for grammar, you have no ear for your grammar. The bottom student in the class is not, by this account, arrogant. They are missing the very faculty that would let them notice they are missing it.

The paper won the Ig Nobel Prize in Psychology in 2000 — the satirical award given to research that “first makes you laugh, then makes you think” — and proceeded to become one of the most cited papers in modern psychology, racking up tens of thousands of citations across two decades.3 Its real-world appeal was obvious. Anyone who had ever sat through a meeting could think of a colleague who proved the thesis.

The Mountain That Was Not There

Somewhere in the years that followed, something happened to the finding. It escaped the journal and entered the wild.

If you have encountered the Dunning-Kruger Effect on the internet, you have almost certainly encountered it in the form of a graph. The graph shows a confidence axis rising steeply to a tall peak labeled “Mount Stupid,” plunging into a chasm called the “Valley of Despair,” and then climbing gradually up the “Slope of Enlightenment” toward a plateau of expertise. It is a beautiful image. It is endlessly meme-able. It is the version your uncle sends in the family group chat to mock his ideological opponents.

It is also not in the paper.

The original 1999 study contains no such curve. Dunning and Kruger did not find that confidence rises to a manic peak among the marginally informed and then collapses with experience. The cartoon graph appears to have been drawn, somewhere in the late 2000s, by someone conflating the Dunning-Kruger findings with a separate concept from corporate-training literature called the “competence ladder.” Tamara Avant, a psychologist who has written about the misattribution, points out that the chart’s emotional satisfactions — the labelled mountain, the wise plateau — are exactly the satisfactions the original paper denied its readers.4

The real graph from the 1999 paper, reproduced in any honest summary, is much duller. It shows perceived ability plotted against actual ability. The line of self-assessment is roughly flat: nearly everyone, regardless of score, rates themselves a little above average. The line of actual performance climbs from low to high. The gap between the two lines is widest at the bottom and narrowest at the top. There is no peak. There is no valley. There is only a stubborn human tendency, distributed across the whole population, to imagine oneself slightly better than middling.

That is a finding about everyone. The viral misreading converted it into a finding about other people — specifically, the loud and confident other people one wished to dismiss. This conversion may be the most quietly Dunning-Krugerian thing about the Dunning-Kruger Effect.

A Statistical Ghost

In 2008, a mathematician named Patrick McIntyre, together with the psychologists Joachim Krueger and David Mussweiler, raised a question that had been hovering quietly over the field for nearly a decade.5 What if the effect was not really an effect at all? What if it was an artifact of how the data had been arranged?

The argument goes like this. Imagine you score a hundred people on a test where the maximum possible self-estimate is 100 and the minimum is zero. People who score very low cannot, mathematically, estimate themselves below their true score by much — there is no room below them. People who score very high cannot estimate themselves above their true score by much, for the same reason. This is called a ceiling-and-floor effect. Add to it a phenomenon called regression to the mean: any noisy estimate, on average, will drift toward the middle of the scale. Take random numbers, plot them this way, and you will reproduce something resembling the Dunning-Kruger pattern even when no psychological effect exists.

McIntyre demonstrated this with simulations. He generated fake data in which self-assessment was deliberately uncorrelated with performance — that is, where there was no psychological insight or blind spot at all, just noise. The plotted result still showed low scorers “overestimating” and high scorers “underestimating” themselves. The curve emerged from arithmetic, not from any quirk of the human mind.
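A minimal sketch of that kind of simulation follows. It is not the researchers' own code; the sample size, the seed, and the function names are assumptions chosen for illustration. Test scores and self-estimates are drawn independently, so by construction nobody has any insight into their own performance, and the participants are then grouped into quartiles by actual score, the way the familiar plot groups them.

```python
import random
import statistics


def percentile_ranks(values):
    """Map each value to its percentile rank (0-100) within the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * position / (len(values) - 1)
    return ranks


def quartile_summary(n=10_000, seed=0):
    """Return (mean actual percentile, mean self-estimate) for each quartile."""
    rng = random.Random(seed)
    # Assumption: actual performance is pure noise, converted to percentiles.
    actual = percentile_ranks([rng.random() for _ in range(n)])
    # Assumption: self-estimates are independent noise on the same 0-100 scale.
    estimate = [rng.uniform(0, 100) for _ in range(n)]

    by_actual = sorted(range(n), key=lambda i: actual[i])
    size = n // 4
    rows = []
    for q in range(4):
        group = by_actual[q * size:(q + 1) * size]
        rows.append((
            statistics.mean(actual[i] for i in group),
            statistics.mean(estimate[i] for i in group),
        ))
    return rows


for number, (actual_mean, estimate_mean) in enumerate(quartile_summary(), start=1):
    gap = estimate_mean - actual_mean
    print(f"quartile {number}: actual {actual_mean:5.1f}  "
          f"self-estimate {estimate_mean:5.1f}  gap {gap:+6.1f}")
```

Every quartile's average self-estimate lands near the middle of the scale, because the estimates are noise. The bottom quartile therefore appears to overestimate itself heavily and the top quartile to underestimate itself, although the data contain no psychology at all.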

The finding did not destroy the original research, but it complicated it. Some portion of what Dunning and Kruger had measured was real psychology. Some portion was math impersonating psychology. The hard question — how much was which — remained open.

A more recent attempt at the answer arrived in 2020, when the Australian psychologist Gilles Gignac and the Polish psychologist Marcin Zajenkowski revisited the question with a different methodology.6 Rather than testing people against an arbitrary skill like joke-rating, they measured actual IQ in a sample of 929 adults and asked each participant to estimate their own IQ on the standard scale. They then analyzed the data with techniques designed to control for the statistical artifacts McIntyre had identified.

The famous gap mostly dissolved. What Gignac and Zajenkowski found, once the regression artifacts were stripped out, was a modest positive correlation between people’s self-estimates and their actual intelligence — roughly 0.3. People with lower IQs did, on average, overestimate themselves slightly more than people with higher IQs, but the difference was small. The dramatic, cartoonish Dunning-Kruger pattern was largely an optical illusion produced by the way the original data had been plotted. The dramatic conclusions that had been built on top of it — about democracy, about expertise, about the dangers of the loudly ignorant — had been built on a foundation considerably less sturdy than the meme suggested.
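To see how a modest correlation and a large quartile gap can coexist, here is a rough sketch. It does not reproduce Gignac and Zajenkowski's analysis. It assumes, purely for illustration, that actual and self-estimated IQ are both normally distributed with mean 100 and standard deviation 15 and are correlated at 0.3 for everyone alike, with no extra blind spot among low scorers. The function names and parameters are hypothetical.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=20_000, rho=0.3, seed=1):
    """Draw (actual, self-estimated) IQ pairs with population correlation rho."""
    rng = random.Random(seed)
    actual, estimate = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)      # standardized true ability
        noise = rng.gauss(0, 1)  # the part of the self-estimate unrelated to ability
        actual.append(100 + 15 * z)
        estimate.append(100 + 15 * (rho * z + math.sqrt(1 - rho ** 2) * noise))
    return actual, estimate


def quartile_gaps(actual, estimate):
    """Mean (estimate - actual) in the bottom and top quartiles of actual IQ."""
    order = sorted(range(len(actual)), key=lambda i: actual[i])
    size = len(order) // 4
    bottom, top = order[:size], order[-size:]
    gap = lambda group: statistics.mean(estimate[i] - actual[i] for i in group)
    return gap(bottom), gap(top)


actual, estimate = simulate()
bottom_gap, top_gap = quartile_gaps(actual, estimate)
print(f"correlation: {pearson(actual, estimate):.2f}")
print(f"bottom quartile gap: {bottom_gap:+.1f} IQ points")
print(f"top quartile gap:    {top_gap:+.1f} IQ points")
```

Under these assumptions the bottom quartile overestimates itself and the top quartile underestimates itself by about the same amount. That symmetry is what regression toward the mean produces whenever the correlation is well below 1, which is why a quartile plot alone cannot show that low scorers are specially unaware.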

What Was Actually There

It would be tidy to say at this point that the Dunning-Kruger Effect has been debunked, and to file the whole thing away. But the science is more interesting than that.

When the statistical artifacts are accounted for, something does remain. It is smaller than the original paper claimed and far smaller than the internet believes. But people who lack a skill do, on average, exhibit a particular kind of confidence: not the manic certainty of Mount Stupid, but a quiet inability to locate the edges of their own knowledge. They do not know what they do not know. The error is not loudness. The error is not noticing.

Dunning has spent the years since 1999 refining this point in subsequent papers, often complaining mildly that the public version of his work has wandered far from the original.7 In a 2011 review in Advances in Experimental Social Psychology, he described the finding less as a chart and more as a problem of metacognition, the mind’s capacity to think about its own thinking.8 In an unfamiliar domain, the metacognitive apparatus simply has nothing to work with. A novice chess player, presented with a position they have just blundered, cannot see the blunder, because the very pattern-recognition that would flag the mistake is the pattern-recognition they do not yet possess. The blindness is not a personality flaw. It is a structural feature of learning anything from scratch.

What Dunning found, in other words, was less a fact about stupid people than a fact about beginnings. Every domain has a stage at which the learner cannot yet tell their work from competent work. Painters describe this. So do programmers, doctors, translators, jazz musicians. The Australian comedian Tim Minchin once gave a graduation speech in which he said that the people who annoy you most in any room are usually the people one rung below you on the same ladder — confident enough to be loud, unaware enough not to be embarrassed. He was, without naming it, describing the original Dunning-Kruger Effect rather accurately.

The Expert’s Mirror

There is a second half to the original finding that almost nobody quotes, and it is the half that gives the whole thing its weight.

In the 1999 paper, top performers were also miscalibrated: they tended to underestimate their relative standing.2 Asked to rate themselves against their peers, students who had aced the grammar test placed themselves only slightly above average. Their error was not arrogance but a kind of generous projection: they assumed that what was easy for them must be easy for everyone, and so they discounted their own ability. Dunning and Kruger did not discover this tendency. It is a version of the false consensus effect, which Lee Ross, David Greene, and Pamela House had named in 1977.9

The consequence is uncomfortable. The same study that the internet uses to mock the overconfident also predicts that experts, on average, will doubt themselves more than they should. The grammar student in the 90th percentile felt no closer to the truth than the student in the 30th. Both felt about average. One was right by accident, the other by the same accident in reverse.

This is the part of the original work that the meme cannot accommodate. The viral graph requires a hero — the wise expert on the plateau, who has crossed the valley and earned their humility. The actual data show no such hero. Expertise does not produce calibrated self-knowledge. It produces a different shape of miscalibration, one in which people who genuinely know what they are doing routinely fail to notice that fact.

If you have ever wondered why so many extraordinary practitioners — researchers, writers, surgeons — speak about their own work with what sounds like false modesty, the answer may not be modesty at all. It may be that they are reporting their phenomenology accurately. From inside expertise, the work does not feel impressive. It feels like the obvious thing, which anyone could have noticed if they had looked.

The Quiet Question

What survives, then, after the statistical corrections and the meme deflation, is something less satisfying than the original story and more useful than the cartoon.

The surviving claim is roughly this. Human self-assessment is poorly calibrated across the board. The bottom of any skill distribution tends, slightly, toward overconfidence — not because those people are uniquely deluded but because the faculty that would detect their delusion is exactly the faculty they have not yet developed. The top tends, slightly, toward underconfidence, because expertise tends to dissolve into the texture of the obvious. And the popular image of a confident fool dancing on Mount Stupid is, mostly, a story we tell about other people to avoid telling it about ourselves.

This last point is the one Dunning has been gentlest in making, perhaps because he recognized early how easy it would be to deploy his research as a weapon. In a 2014 essay in Pacific Standard, he noted that the people most likely to share the Dunning-Kruger graph were the people least likely to apply it inwardly.7 The effect, he wrote, was not about the loud guy at the end of the bar. It was about the quiet, settled certainty that anyone — including the researcher, including the reader — carries about domains they have never seriously tested.

The practical implication is small but stubborn. In any field where you have not done the work, your sense of how much you understand is approximately worthless as a guide to how much you actually understand. The only correction available is external: feedback from people who can see what you cannot yet see, error signals from a world that is willing to push back, the slow attrition of being wrong in public. Confidence, calibrated or otherwise, does not produce competence. Only the loop does.

Which returns, in a roundabout way, to McArthur Wheeler and his lemon juice. The thing that undid Wheeler was not the absurdity of his hypothesis. Plenty of useful scientific hypotheses have begun their lives in equally absurd places. The thing that undid Wheeler was that he tested his theory once, badly, in a way that could not refute it, and accepted the result. He pointed the camera at the ceiling. He did not ask the question that would have rescued him.

The question is the cheapest tool in epistemology, and the hardest to use on oneself. It has no special name in the literature. It goes something like: how would I know if I were wrong about this? Asked of any belief, in any domain, it tends to produce one of two answers. Either there is a way — an experiment, a check, a person whose disagreement would matter — or there is not. The beliefs in the second category are the ones to be careful with. They are the lemon juice on the face. They cannot be falsified by the photograph because the photograph was never pointed at anything real.

That is what the Dunning-Kruger Effect, properly understood, leaves behind. Not a chart, not a punchline, not a way of categorizing the people one finds tiresome. A small, recurring suspicion, available at any moment to anyone willing to feel it: that the confidence one is feeling right now might be the kind that comes from knowing the territory, or it might be the kind that comes from never having walked it. The two feel, from the inside, exactly the same.


Sources

  1. Fuocco, Michael A. “Trial and Error: They Had Larceny in Their Hearts, but Little in Their Heads.” Pittsburgh Post-Gazette, 1996. — https://www.post-gazette.com/news/crime-courts/1996/03/21/Trial-and-error-They-had-larceny-in-their-hearts-but-little-in-their-heads/stories/199603210146
  2. Kruger, Justin, and David Dunning. “Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments.” Journal of Personality and Social Psychology, 1999. — https://psycnet.apa.org/doi/10.1037/0022-3514.77.6.1121
  3. Improbable Research. “The 2000 Ig Nobel Prize Winners.” Annals of Improbable Research, 2000. — https://improbable.com/ig/winners/#ig2000
  4. Jarry, Jonathan. “The Dunning-Kruger Effect Is Probably Not Real.” McGill Office for Science and Society, 2020. — https://www.mcgill.ca/oss/article/critical-thinking/dunning-kruger-effect-probably-not-real
  5. Krueger, Joachim, and Ross A. Mueller. “Unskilled, Unaware, or Both? The Better-Than-Average Heuristic and Statistical Regression Predict Errors in Estimates of Own Performance.” Journal of Personality and Social Psychology, 2002. — https://doi.org/10.1037/0022-3514.82.2.180
  6. Gignac, Gilles E., and Marcin Zajenkowski. “The Dunning-Kruger Effect Is (Mostly) a Statistical Artefact: Valid Approaches to Testing the Hypothesis with Individual Differences Data.” Intelligence, 2020. — https://doi.org/10.1016/j.intell.2020.101449
  7. Dunning, David. “We Are All Confident Idiots.” Pacific Standard, 2014. — https://psmag.com/social-justice/confident-idiots-92793
  8. Dunning, David. “The Dunning-Kruger Effect: On Being Ignorant of One’s Own Ignorance.” Advances in Experimental Social Psychology, 2011. — https://doi.org/10.1016/B978-0-12-385522-0.00005-6
  9. Ross, Lee, David Greene, and Pamela House. “The False Consensus Effect: An Egocentric Bias in Social Perception and Attribution Processes.” Journal of Experimental Social Psychology, 1977. — https://doi.org/10.1016/0022-1031(77)90049-X