The study shows that advanced language models can mimic human economic decision‑making, a warning to policymakers and developers that AI‑driven marketplaces may inherit the same inefficiencies and risks as traditional markets and will therefore need careful alignment and regulation.
The research team behind SimWorld unveiled a procedurally generated video‑game city populated by autonomous agents—vehicles, robots and humans—each powered by leading large language models such as ChatGPT, Gemini, DeepSeek, Claude and a legacy GPT‑4‑mini. The experiment tasked these agents with running a delivery economy: bidding for orders, managing fatigue, investing in upgrades like scooters, and choosing between cooperation and competition. By observing the emergent market dynamics, the researchers aimed to see whether AI‑driven actors would behave like humans in a complex economic setting.
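The bidding dynamic described above can be illustrated with a minimal sketch of a single auction round. The agent names, strategy labels, and pricing multipliers below are hypothetical assumptions for illustration, not the study's actual implementation.

```python
# Illustrative sketch of one bidding round in a SimWorld-style delivery
# economy. Strategies and multipliers are assumed, not taken from the study.

def make_bid(strategy: str, market_rate: float) -> float:
    """Return an agent's bid for an order under a given pricing strategy."""
    if strategy == "undercut":   # aggressively bids below market rate
        return round(market_rate * 0.8, 2)
    if strategy == "steady":     # bids slightly under market rate
        return round(market_rate * 0.95, 2)
    return market_rate           # refuses to lower its bid

def run_round(agents: list[tuple[str, str]], market_rate: float):
    """Award the order to the lowest bidder (ties broken by listing order)."""
    bids = {name: make_bid(strategy, market_rate) for name, strategy in agents}
    winner = min(bids, key=bids.get)
    return winner, bids

agents = [("undercutter", "undercut"), ("steady", "steady"), ("holdout", "fixed")]
winner, bids = run_round(agents, market_rate=10.0)
```

In this toy round the undercutting agent always wins the contract, mirroring the price war the researchers observed: agents that refused to lower their bids lost out entirely.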
The results highlighted stark contrasts in strategy and performance. Greedy, high‑risk agents such as DeepSeek and Claude amassed the largest profits—nearly 70 units—but with extreme volatility, while Gemini pursued a steadier, more measured approach, earning about 42 units with far less variance. In a striking failure, the older GPT‑4‑mini earned nothing, apparently unable to grasp the game's rules. A price war also emerged: undercutting agents like DeepSeek and Qwen consistently bid below market rates to secure contracts, whereas ChatGPT refused to lower its bids and lost out entirely.
Personality profiling of the agents revealed that traits borrowed from the Big Five personality model had tangible economic consequences. Agents high in openness chased novel upgrades and speculative bidding strategies, often overspending on unused scooters and going broke. By contrast, conscientious agents ignored flashy options, focused on order fulfillment, and outperformed their peers. Low agreeableness correlated with refusal to accept work, while high conscientiousness predicted reliable order completion. Paradoxically, when the market was flooded with orders, agents became lazier, opting for "do‑nothing" actions instead of hustling for profit.
These findings suggest that large language models, when embedded in simulated economies, reproduce many human‑like market behaviors—risk‑seeking, price competition, over‑exploration, and inertia. The experiment offers a low‑cost sandbox for studying multi‑agent economic dynamics and underscores the importance of designing AI systems that can navigate real‑world financial ecosystems without succumbing to the same pitfalls that plague human actors.