The Fairest Test

From motivated reasoning to tailored experiments

May 12, 2026

I grew up in South Florida, surrounded by people allergic to the cold. I had only set foot in the Northeast twice before. My first meeting was with Milt Lodge.

I had spent hours in the library reading The Oxford Handbook of Political Psychology and The Affect Effect. Lodge and Taber. Taber and Lodge. By the time I arrived at Stony Brook, I felt like I was about to meet the high priests of political psychology. And I was.

I walked into Milt’s office. The Long Island Sound was visible in the distance. Three mahogany chairs. A large wooden desk. A small ceramic phrenology bust. My hands were trembling.

“I’ve read a lot of your work,” I said.

Milt was electric. He rattled through decades of work on political cognition, building to a sharp account of the John Q. Public model he developed with Chuck Taber. Then he turned to a new study on judicial symbols and the unconscious activation of legitimacy.

“Wow,” I said under my breath.

He looked back at me and said, “Wow is right!”

The next five years felt like induction into an esoteric guild. At the center was motivated reasoning: the idea that people seek out information and interpret evidence through the lens of their political commitments. Lodge and Taber’s Rationalizing Voter was the sacred text.

After receiving my PhD, my own research wandered: immigration, geographic context, authoritarianism. But the pandemic marked a break. My work with Ethan Porter and Tom Wood on misinformation during the disorienting politics of Covid-19 took priority. Across studies in the United States and abroad, we kept finding, much to my surprise, that beliefs often moved in the direction of evidence, even among the skeptical.

It felt like a betrayal of the Stony Brook model.

So I wanted to give motivated reasoning its most generous possible test. What would Milt Lodge and Chuck Taber say? I could hear them in the back of my head: These aren’t beliefs worth defending. The information isn’t emotionally charged enough. The issues are too distant.

The fairest test, I thought, would have to start by asking each person what they actually cared about.

I toyed with potential designs and kept running into a wall. It would be impractical to expose every respondent to a handwritten argument tailored to each of their considerations.

Then the technology changed.

GPT-3 made the design plausible. The model could follow instructions and produce coherent arguments. On-the-fly tailoring was no longer a fantasy. We could ask people what mattered to them, then generate treatments that spoke directly to those political commitments.

Theories like motivated reasoning are premised on concepts that vary across people. An attitude must be strong enough to be worth defending. Yet when we expose all participants to the same stimulus, we get a mixture: some who care deeply about the particular issue and some who could not care less.

Since 2022, Patrick Liu and I have been trying to solve that problem: how to design experiments for theories that depend on people caring about different things. This work will be the core of our Cambridge Element, Tailored Experiments: Personalized Interventions Using Generative AI, edited by Jamie Druckman.

In a tailored experiment, the treatment is not fixed in advance. We measure something specific about each respondent: a belief they hold, a concern they have, their identity. Then we ask a language model to generate the stimulus from there.

The stimulus is no longer written based on what experts assume matters most. What each person sees is personally relevant, built from their own beliefs and concerns. The reasons they doubt medicine. The version of the immigration debate that lives in their head. The discussion they would have wanted to have with peers or family members.

The rest of the design is familiar. Random assignment. A control condition. The same causal goals.
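For readers who think in code, the flow above can be sketched roughly as follows. This is an illustration, not the pipeline from the studies: the names `Respondent`, `build_prompt`, `assign_and_generate`, and `fake_generate` are all hypothetical, the prompt wording is invented, and the language model is represented by a plain function you would swap for a real client.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Respondent:
    respondent_id: str
    stated_concern: str  # open-ended answer measured before treatment


def build_prompt(concern: str) -> str:
    """Turn a respondent's own stated concern into an instruction for the model.

    The wording here is an assumption made for illustration only.
    """
    return (
        "Write a short, civil counterargument addressed to someone who says: "
        f"'{concern}'"
    )


def assign_and_generate(
    respondents: List[Respondent],
    generate: Callable[[str], str],
    control_text: str,
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Randomly assign each respondent to an arm, then build their stimulus."""
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    rows = []
    for r in respondents:
        arm = rng.choice(["control", "tailored"])
        if arm == "tailored":
            # Treatment text is generated from what this person said they care about.
            stimulus = generate(build_prompt(r.stated_concern))
        else:
            # Everyone in the control arm sees the same fixed text.
            stimulus = control_text
        rows.append({"id": r.respondent_id, "arm": arm, "stimulus": stimulus})
    return rows


def fake_generate(prompt: str) -> str:
    """Stand-in for a language model call; replace with a real client."""
    return f"[model output for: {prompt}]"
```

Calling `assign_and_generate(respondents, fake_generate, "A neutral passage.")` returns one row per respondent. The point of the sketch is the ordering: measurement comes first, randomization second, and generation only happens inside the treated arm.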

Coming back to our test, what we found was conditional. Anodyne counterarguments targeting deeply held issues decreased certainty and attitude strength. Vitriolic counterarguments produced the opposite: attitude polarization, the backfire effect that motivated reasoning was always supposed to predict. The findings challenged the strongest version of motivated reasoning, but they also revealed an interesting corner case: people may move with evidence under ordinary conditions, yet resist when arguments viciously attack what they care about.

For me, the project started with a feeling of disloyalty. The evidence kept moving away from the work I was trained on. Motivated reasoning was always a theory about people defending particular commitments. Tailored experiments are a way to study that without assuming everyone is defending the same attitude.

Years after that first meeting, I still think about Milt’s office. The Long Island Sound through the window. The phrenology bust. And Milt, looking back at me, saying, “Wow is right.”

The “Stony Brook school” of political psychology had its own commitments, its own detractors, its own uneasy place inside political science, and its own canon.

The highest form of respect for a theory is not empty reverence, but a fair test. And sometimes, the fairest test requires building the design it deserves.

New Instruments
