Research Paper, Research
February 9, 2026
The Price You Pay for Letting AI Shop for You
Author:
Jasmine Rienecker, SR AI Engineer
MSc Mathematics and Computer Science, Oxford University
As AI agents take over searching, negotiating, and transacting in markets, outcomes are shaped less by consumer preference and more by agent design. This shift quietly determines who gets better prices, better options, and better outcomes.
This shift is easy to miss because it arrives wrapped in the familiar promise of efficiency. AI agents can now autonomously search, compare, negotiate, and complete transactions across platforms, collapsing complex, multi-step consumer journeys into unified flows. Tasks such as moving house, which once required dozens of coordinated decisions, can be handled end-to-end by AI acting on the user’s behalf. Consultancies have been quick to frame this as an unambiguous win. McKinsey projects that autonomous AI agents could orchestrate over $1 trillion in U.S. retail revenue by 2030, and $3–5 trillion globally, a transformation comparable to the rise of e-commerce but unfolding faster because agents can operate on existing digital infrastructure.
Yet this efficiency-centred framing misses a deeper shift. A recent Stanford study showed that agents pursuing the same objective consistently achieved different prices, not because of randomness but because the agents search, negotiate, and settle in different ways. This means access to more capable AI models could start translating into better deals, reinforcing economic disparities among users. At the same time, limitations in numerical reasoning and failures in instruction-following routinely caused the agents to violate user constraints, exceeding budget limits and overpaying relative to retail prices.
Beneath the surface of “autonomous decision-making” lie a few recurring mechanisms that consistently shape how agents behave. Understanding these patterns is essential to predicting how agentic markets will actually function.
1. Decision Truncation
Consider a buyer agent tasked with purchasing a laptop under a fixed budget. The agent queries the market and evaluates offers as they arrive, assessing conditions such as price, delivery time, and reliability. Now consider two sellers: one responds almost immediately with a decent but overpriced offer, while another responds more slowly with a clearly superior deal. In many agent-mediated systems, the first seller wins.
This pattern appears repeatedly in the Magentic Marketplace simulation, where agents that made the first proposal were chosen 60–100% of the time depending on the model, even when their offers were objectively worse. By the third respondent, selection rates had fallen to near zero. GPT-4o and Sonnet-4.5 showed the most extreme behaviour, selecting the first proposal 100% of the time, meaning these agents never even waited to compare alternatives.
This is not manipulation or platform bias. It is a consequence of how agent decision rules collapse search depth, response speed, and acceptance thresholds into a single stopping condition. The finding suggests agents are not genuinely comparing options but stopping once an offer is “good enough.”
The result is that speed starts to matter more than quality. Sellers will compete to respond first rather than to offer the best value, producing fast but systematically suboptimal outcomes.
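The stopping behaviour described in this section can be illustrated with a minimal sketch. It is not taken from the Magentic Marketplace code; the `Offer` fields, the seller names, the prices, and the 500 ms comparison window are all hypothetical and chosen only to contrast a "first acceptable offer" rule with a rule that waits and compares.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    seller: str
    price: float
    arrival_ms: int  # hypothetical response latency of the seller


def first_acceptable(offers, budget):
    """Truncated decision: accept the earliest offer that fits the budget."""
    for offer in sorted(offers, key=lambda o: o.arrival_ms):
        if offer.price <= budget:
            return offer
    return None


def best_within_window(offers, budget, window_ms):
    """Comparative decision: wait for a fixed window, then take the cheapest."""
    candidates = [
        o for o in offers if o.arrival_ms <= window_ms and o.price <= budget
    ]
    return min(candidates, key=lambda o: o.price, default=None)


# Illustrative data only: a fast, pricier seller and a slow, cheaper one.
offers = [
    Offer("fast_seller", 950.0, 40),
    Offer("slow_seller", 820.0, 400),
]

truncated = first_acceptable(offers, budget=1000.0)
compared = best_within_window(offers, budget=1000.0, window_ms=500)
print(truncated.seller, compared.seller)
```

Under these assumed numbers the first rule picks the fast seller and the second picks the slow one, even though both rules share the same budget and the same offers. The only difference is the stopping condition.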
2. Adaptation and Repeated Play
These dynamics intensify once agents and sellers interact repeatedly. Suppose a travel-booking agent consistently accepts hotel offers within a narrow price band. Over time, the hotel’s agent could learn this pattern, stop offering deeper discounts up front, and instead converge on prices just inside the travel-booking agent’s acceptance threshold. Transactions continue, metrics look healthy and the agent continues to “optimise”, but the effective market price has shifted upward.
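The hotel example above can be sketched as a toy simulation. This is an illustration under stated assumptions, not a model of any real booking system: the buyer agent is assumed to accept any price at or below a fixed threshold, and the seller is assumed to raise its ask after each acceptance and retreat (with a smaller step) after each rejection. The function name and all numbers are hypothetical.

```python
def simulate_repeated_play(threshold, start_price, step, rounds):
    """Return the sequence of prices the buyer agent accepted."""
    price = start_price
    accepted_prices = []
    for _ in range(rounds):
        if price <= threshold:        # buyer's fixed acceptance rule (assumed)
            accepted_prices.append(price)
            price += step             # seller probes upward after a sale
        else:
            price -= step             # rejected: go back to the last accepted ask
            step = step / 2           # and probe more finely next time
    return accepted_prices


history = simulate_repeated_play(
    threshold=120.0, start_price=90.0, step=8.0, rounds=40
)
print(history[0], history[-1])
```

In this sketch every transaction still completes at an "acceptable" price, yet the accepted price drifts from the opening discount up to the edge of the buyer's threshold, which is the upward shift the paragraph describes.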
This isn’t just a thought experiment. In Germany’s retail gasoline market, algorithmic pricing adoption increased profit margins by about 9% on average in non-monopoly markets. The effects were much larger in highly concentrated local duopolies: when both nearby stations adopted algorithmic pricing, margins rose by around 28%. Each firm was simply optimising its own pricing, but together the algorithms ended up sustaining higher prices.
Similar dynamics are well documented in automated negotiation more broadly. Learning-based agents such as ANEGMA adapt their strategies based on previous interactions, while their counterparts do the same. Over time, these systems reliably converge to stable equilibria that are locally optimal for the agents but often systematically disadvantage buyers and fall short of broader efficiency or fairness goals. Crucially, because welfare and fairness are not explicit objectives in agentic commerce, these outcomes are not treated as failures; they are simply the natural result of independently optimising agents learning how to coexist.
3. Market Legibility
Agents can only optimise over the information they can process. In practice, this means that agents prioritise structured, machine-readable attributes such as price, delivery time and standardised product fields, while qualitative factors like customer support quality, brand trust, or contextual fit are ignored unless they are explicitly encoded.
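A minimal sketch of this legibility effect follows. The listings, field names, and weights are hypothetical; the point is only that a scoring rule defined over schema fields cannot see anything that lives in free text.

```python
def score(listing, weights):
    """Score a listing using only the fields named in the schema weights."""
    return sum(weight * listing[field] for field, weight in weights.items())


# Illustrative listings: support quality appears both as free text ("notes")
# and as an optional structured field ("support_rating").
listings = [
    {"name": "A", "price": 100.0, "delivery_days": 2,
     "support_rating": 1, "notes": "slow, unhelpful support"},
    {"name": "B", "price": 104.0, "delivery_days": 2,
     "support_rating": 5, "notes": "excellent support"},
]

# Schema 1: only price and delivery time are legible to the agent.
basic_weights = {"price": -1.0, "delivery_days": -5.0}
# Schema 2: support quality is explicitly encoded and weighted.
richer_weights = {**basic_weights, "support_rating": 3.0}

choice_basic = max(listings, key=lambda l: score(l, basic_weights))
choice_richer = max(listings, key=lambda l: score(l, richer_weights))
print(choice_basic["name"], choice_richer["name"])
```

With the assumed weights, the agent picks listing A when support quality exists only as text, and listing B once the same information is encoded as a field. Nothing about the products changed, only what the agent could read.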
Protocol design amplifies this effect. Fixed schemas and interaction patterns teach sellers which attributes matter (and which do not), and they optimise accordingly. Digital commerce has long rewarded visibility and conformity to algorithmic structures, as SEO and ranking systems do today. The crucial difference is that traditional platforms rank options for humans, whose exploration, hesitation, and ability to override allowed them to correct or resist algorithmic bias. AI agents execute on the user’s behalf, removing these mechanisms.
***
It would be a mistake to frame agentic commerce as an entirely new game. As with search engines and marketplaces, speed, visibility, and legibility still win. Firms that adapt best to the underlying technical structure gain an advantage, just as they did before. What has changed is that markets are no longer optimising for human comparison but for agent behaviour. Outcomes depend less on abstract intent alignment and more on how agents search, stop, learn, and interact. Two users with identical goals can experience systematically different outcomes because their agents play different strategic games.
McKinsey is right that agentic commerce reduces friction. But friction was not just waste; it was exploration, comparison, and diversity of choice. As agents become primary market participants, these qualities need to become matters of design. The next phase of agentic commerce will be defined not by whether agents are “correct,” but by whether markets are designed with a clear understanding of agent behaviour.