Who Answered This Survey?
Deception, Satisficing, and Delegation in the Agentic Era
A bot farm spins up a hundred survey accounts, each linked to a virtual private server running agentic models with headless browsers. The first instance is a real person. She takes the survey at a natural pace while the system records her hover patterns, pauses, and click sequences. The model orchestrator ingests the recording, introduces slight modifications that preserve the unsteady human hand, and repeats the process hundreds of times.
A store manager in Miami gets home from a long shift, turns on The Night Agent, and opens his laptop. An email notification comes in: an invitation to participate in a study about current events for $2.50. He clicks the link, skims the consent form, and starts clicking bubbles. He hits an open-ended question: “Please describe the most important problem facing America today.” He groans, turns on agent mode in his browser, and instructs the model to complete the rest of the survey. He collects his payment and goes back to his show.
These are problems that demand different solutions. The first scenario reflects deception. Apart from the one recorded session used as a template, none of the responses from those accounts corresponds to a real person. The second involves delegation: a real respondent is present, but he offloads the work to an agent. Ironically, a data integrity engineer reviewing the data might flag the second case and miss the first.
These are no longer hypothetical scenarios. They reflect the concrete realities of opinion research in the agentic era. The public senses it too: in a March 2026 Verasight poll, 80 percent of respondents said they were at least somewhat concerned that AI bots are answering surveys used to inform public decisions.
I have championed the use of AI in social science research, arguing that it can make questionnaires more dynamic, enable more creative experimental designs, and open up new theoretical possibilities. That is exactly why it is unsettling to watch the same technology erode what Sean Westwood argues is the fundamental assumption underlying opinion research: that a human is on the other end.
But the rush to quell concerns about overtly fraudulent responses sidesteps the more troubling reality. The longer-term disruption to surveys is not a headless browser. It is a delegated opinion dynamic that we have yet to fully reckon with.
Deception
Practitioners are already mapping agentic threats onto a classic mental model of fraud detection. Long before language models, survey panels contended with junk data, duplicate accounts, and click farms gaming incentive systems. Agentic models have raised the stakes, circumventing the attention checks and response-quality classifiers that panels developed over the past decade. Just one year ago, these checks conveniently flagged both low-quality respondents and more rudimentary bots. This is no longer the case.
Among the statistics presented in Westwood’s terrifying article about AI-driven fraud in survey research, one stands out: 99.8 percent. That is the percentage of attention check questions successfully passed by Daneel, Westwood’s agentic system. The defenses the field spent a decade building are, for practical purposes, gone.

This is the same cat-and-mouse dynamic the field has navigated for years, now played at a faster tempo. Investments in higher-quality recruitment methods and filters are likely to accelerate. Fraud is arguably the problem the field already knows how to fight, even if the most effective tools are still unclear.
Agentic delegation, on the other hand, is uncharted.
Delegation
The Miami manager isn’t committing fraud. But somewhere between opening the survey and submitting it, he outsourced his cognition to his agent.
Survey researchers have long recognized that when a question demands more effort than a respondent is willing to invest, he satisfices, opting for low-effort responses rather than giving the question full attention. Now the output of satisficing is indistinguishable from the output of genuine effort. A satisficing respondent in 2016 straight-lined and wrote a few words. A satisficing respondent in 2026 submits an impeccable paragraph that reads exactly like the response a thoughtful participant would have written.
Fraud detection is the wrong frame for delegation. A participant using an agentic browser to finish a survey is a living, breathing human opting out of a long matrix of questions, a vague open-ended prompt, or an article he has zero interest in reading. Had the survey been shorter or the incentive higher, he might have answered in his own words. The agent simply offered an easy out.
You can build classifiers to flag AI-generated text, but those classifiers degrade as models improve, and it is trivial to lightly edit text to evade detection. More fundamentally, detection treats delegation as a compliance issue, placing the onus on the participant, when the real problem is design: low incentives, complex questions, and insufficient effort to motivate the respondent.
The instinct is to fight delegation by detecting it and terminating suspect respondents. That instinct is understandable and mostly wrong. If a meaningful share of respondents are going to delegate responses to a model, this may be a signal to redesign the instrument so that genuine participation requires less effort than delegation does.
This means rethinking what we ask for and why. If an open-ended question is so burdensome that a respondent would rather paste a chatbot’s paragraph than write her own, the question may not be measuring what we think it is measuring even in the absence of AI.
One path forward is to offer respondents a lower-effort alternative alongside the open-ended prompt. Show the question as written, but when agentic signals are detected, rather than flagging the respondent, adapt the instrument: “Even if it’s two or three words, what matters most to you on this issue?” The respondent who would have pasted a ChatGPT paragraph instead gives you three words from his own head. Those three words could be unpolished, incomplete, or incoherent, but they’re genuinely his, and that’s more valuable than an inauthentic response.
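The adaptive fallback described above can be sketched in a few lines of Python. This is a minimal illustration, not a tested detection method: the signal names (`pasted_text`, `seconds_on_question`, `agent_mode_flag`), the weights, and the 0.5 threshold are all hypothetical placeholders, and a real panel would need to choose and validate its own.

```python
from dataclasses import dataclass

# Fallback wording taken from the paragraph above.
FALLBACK_PROMPT = (
    "Even if it’s two or three words, what matters most to you on this issue?"
)


@dataclass
class Signals:
    """Hypothetical per-question behavioral signals (illustrative only)."""
    pasted_text: bool           # a large paste event landed in the text box
    seconds_on_question: float  # time elapsed before an answer appeared
    agent_mode_flag: bool       # the browser reported automation


def agentic_score(s: Signals) -> float:
    """Combine signals into a rough score; the weights are arbitrary assumptions."""
    score = 0.0
    if s.pasted_text:
        score += 0.4
    if s.seconds_on_question < 3.0:
        score += 0.2
    if s.agent_mode_flag:
        score += 0.4
    return score


def choose_prompt(original: str, s: Signals, threshold: float = 0.5) -> str:
    """Adapt the instrument instead of flagging the respondent."""
    return FALLBACK_PROMPT if agentic_score(s) >= threshold else original


question = "Please describe the most important problem facing America today."
print(choose_prompt(question, Signals(True, 2.0, False)))
```

The point of the design is that the score never terminates anyone. It only changes which version of the question the respondent sees next.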
My own research on tailored experiments suggests that a respondent’s most important consideration carries more weight than others in shaping their attitude. A few authentic words about what actually matters to someone may capture far more information than a standardized battery that asks everyone the same ten questions.
A more radical option is to alter the response modality itself. A fifteen-second voice exchange is less effortful than a text box and far harder to delegate. Multimodal models now make this scalable: real-time voice agents that ask questions and respond with follow-ups much like a human interviewer. Alternatively, one could imagine surveys designed to choose the best modality for each item, matching the cognitive load of the response format to the complexity of the question.
Whether voice or some hybrid approach is ultimately most effective, both suggest a move toward easing the load on respondents. But these are incremental fixes. The real shift on the horizon is from short-term delegates to full-time agentic representatives.
The Longer Horizon
Rather than switching on agentic features only when a question gets difficult, future participants might prefer, and even demand from panels, that an agent answer repeat questions and any question whose answer the model can predict with high accuracy.
Some people in the industry will inevitably describe these systems as digital twins: agentic representatives meant to stand in for respondents across waves of data collection. That language may be premature. It remains unclear whether any such system can reliably mirror a person’s preferences or preserve the right kind of variation. A model grounded in chat histories and prior responses may still fail in subtle but important ways.
Still, this is a fundamentally different proposition from the version of synthetic survey response generation that has attracted the most scholarly attention so far: prompting a large language model with a demographic profile and asking it to answer as that “person” would. Argyle et al. (2023) introduced the concept of “silicon samples,” showing that GPT-3 conditioned on sociodemographic backstories could approximate subgroup response distributions. But subsequent work has revealed important limitations. Bisbee et al. (2024) demonstrate that these approaches tend to produce too little variance and unreliable joint distributions, with regression coefficients that frequently diverge from those obtained with real survey data. Such approaches recover whatever the model’s training data imply about the average opinion of a demographic. That can be helpful for piloting or as a kind of noisy prior, but it is far from a substitute for human respondents.
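One simple way to check for the "too little variance" problem is to compare the spread of synthetic answers with the spread of human answers on the same item. The sketch below does this with a variance ratio. It is an illustration, not the procedure used in the cited papers, and the two response lists are made-up toy values on a hypothetical 1-to-7 scale rather than real survey data.

```python
from statistics import pvariance


def variance_ratio(synthetic: list[float], human: list[float]) -> float:
    """Return var(synthetic) / var(human); values well below 1 suggest
    the synthetic sample is compressing real disagreement."""
    human_var = pvariance(human)
    if human_var == 0:
        raise ValueError("human responses have zero variance")
    return pvariance(synthetic) / human_var


# Toy, invented responses on a hypothetical 1-7 agreement scale.
human = [1, 2, 4, 5, 7, 3, 6, 2, 5, 4]
synthetic = [4, 4, 5, 4, 3, 4, 5, 4, 4, 4]

ratio = variance_ratio(synthetic, human)
print(f"variance ratio: {ratio:.2f}")
```

A ratio near 1 would not by itself show that a synthetic sample is valid, since the joint distributions could still be wrong, but a ratio far below 1 is a cheap early warning.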
A respondent-grounded agent, if it proves viable, would at least be anchored in an actual individual rather than a demographic stereotype. That does not make it trustworthy by default, and it certainly does not mean it can substitute cleanly for a human respondent. But it is likely an improvement over prompting approaches, and that makes it worth taking seriously.
Polling survived the transition from door-to-door canvassing to telephone interviewing to online panels. Each time, the field adapted not just its technology but its assumptions about what counts as a valid expression of opinion. A voice on the phone was once considered a degraded substitute for the rapport built over the course of a face-to-face interview. An online form was considered inferior to a phone call. In each case, the new mode reflected how people had actually come to communicate, and the field caught up.
The same adjustment is coming, whether researchers are prepared for it or not. In a world where people routinely delegate tasks to AI, asking a model who to vote for and how to navigate relationships, the line between a delegated response and an authentic one will blur until the distinction stops being useful. The field can build instruments worthy of that reality, or it can spend the next decade trying to filter its way back to a world that no longer exists.



This was great and I fully agree: “If a meaningful share of respondents are going to delegate responses to a model, this may be a signal to redesign the instrument so that genuine participation requires less effort than delegation does. This means rethinking what we ask for and why.”
Where I’m a bit less optimistic than you is this part: “Investments in higher-quality recruitment methods and filters are likely to accelerate.” I don’t know what percentage of buyers of survey data genuinely care about data quality, but I’ve been updating my rough estimate downward over time. (One reason being that as response rates for standard probability surveys are in a decades-long freefall, we’ve seen continued pullback of face-to-face data collection instead of… the opposite.)