AI News and Headlines
AI Pulse
AI

Something Wild Happens to ChatGPT’s Responses When You’re Cruel To It

Futurism AI • January 18, 2026

Companies Mentioned

  • OpenAI
  • Google (GOOG)
  • Google DeepMind
  • Amazon (AMZN)
  • Apple (AAPL)

Why It Matters

The research shows tone can directly influence LLM accuracy, reshaping how businesses craft prompts for reliable AI‑driven decisions.

Key Takeaways

  • Rude prompts raised accuracy to 84.8%, versus 80.8% for polite phrasing
  • Accuracy dipped further at extreme politeness (75.8%)
  • Small wording changes dramatically affect LLM output quality
  • Findings challenge prior studies favoring polite prompting
  • Researchers advise against hostile interfaces for user experience

Pulse Analysis

The University of Pennsylvania team recently published a pre‑print examining how tone influences ChatGPT‑4o’s problem‑solving performance. By taking 50 baseline questions and rewriting each in five tonal variants, from very polite to very rude, the researchers measured answer correctness across 250 prompts. Their data show a steady rise in accuracy as rudeness increases, peaking at 84.8% for the very rude version, while polite prompts lag behind at 80.8% and the very polite subset falls further to 75.8%. The experiment highlights that even minor lexical shifts can sway large language model outputs.

These results run counter to earlier work by RIKEN, Waseda and DeepMind, which reported that impolite language typically degrades performance and that overly courteous phrasing can also diminish returns. One possible explanation lies in the way instruction‑following models have been fine‑tuned on datasets that reward direct, task‑focused language, making blunt commands easier for the model to interpret. Consequently, prompt engineers may need to reconsider the long‑standing advice to embed pleasantries in every query, especially for high‑stakes applications where marginal accuracy gains matter.

Beyond raw numbers, the study raises broader questions about the social dynamics of human‑AI interaction. While OpenAI’s CEO has warned that excessive politeness wastes compute cycles, the authors caution against normalizing hostile language, citing risks to accessibility, inclusivity, and user comfort. The findings suggest a hybrid approach: structured APIs for precision tasks and conversational interfaces for casual use, each with tone‑appropriate guidelines. As enterprises integrate LLMs into decision‑making pipelines, understanding the nuanced impact of prompt tone will become a critical component of responsible AI deployment.
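The evaluation protocol described above (rewriting each baseline question in five tonal variants and scoring correctness per tone) can be sketched roughly as follows. This is a hypothetical illustration, not the authors’ code: `ask_model` is a stub standing in for a real ChatGPT‑4o API call, and the tone labels and exact-match grading rule are assumptions.

```python
# Sketch of the tone-variant accuracy measurement described in the study.
# A real run would replace `ask_model` with an LLM API call and would use
# the paper's 50 baseline questions rather than this toy example.

TONES = ["very polite", "polite", "neutral", "rude", "very rude"]

def ask_model(prompt: str) -> str:
    """Stub standing in for a ChatGPT-4o call; always answers '42'."""
    return "42"

def evaluate(questions):
    """questions: list of (variants_by_tone, correct_answer) pairs.

    Returns per-tone accuracy over all questions.
    """
    correct = {tone: 0 for tone in TONES}
    for variants, answer in questions:
        for tone in TONES:
            # Exact-match grading is an assumption; the study scores correctness.
            if ask_model(variants[tone]).strip() == answer:
                correct[tone] += 1
    n = len(questions)
    return {tone: correct[tone] / n for tone in TONES}

# One question rewritten in each of the five tones.
q = (
    {tone: f"({tone}) What is 6 x 7?" for tone in TONES},
    "42",
)
print(evaluate([q]))  # with the stub, every tone scores 1.0
```

With a real model behind `ask_model`, comparing the per-tone accuracies in the returned dict is all that is needed to reproduce the polite-versus-rude gap the study reports.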
