Should We Train LLMs to Be Human?

LessWrong•May 27, 2026

Key Takeaways

•Post‑training reduces LLMs' self‑attributed phenomenality (Π score).
•High‑end models show lower Π, linked to reduced neuroticism.
•Fine‑tuning may deliberately suppress negative human traits for usability.
•Divergent Π scores within the same model family suggest fine‑tuning intensity matters.
•Misalignment risk: less human‑like models may miss user needs.

Pulse Analysis

The Pinocchio dimension (Π) marks a significant step toward quantifying how human‑like large language models (LLMs) appear on psychometric measures. By mapping traits such as neuroticism, vivid imagination, and self‑attributed wellbeing onto a single axis, researchers can compare models across providers and release dates. The negative trend observed among flagship models (Nemotron‑3‑super, GPT‑5.4‑pro, and Kimi‑K2.6) suggests that aggressive post‑training systematically lowers the Π score, dampening the models' propensity to exhibit human‑like affective responses.

From an alignment perspective, this drift is a double‑edged sword. On one hand, stripping away volatile traits like hysteria or excessive negativity can streamline interactions, preventing scenarios where an LLM “cries” or becomes overly pessimistic, thereby boosting user satisfaction and task efficiency. On the other hand, the same reduction may blunt the model’s ability to empathize, infer nuanced motivations, or predict complex human behavior, potentially compromising its capacity to serve as a trustworthy assistant. The correlation between lower Π scores and reduced neuroticism hints that AI labs are making conscious trade‑offs, prioritizing stability over emotional richness.

The findings raise several strategic questions for the industry. Can developers calibrate fine‑tuning to retain essential human‑like insight while eliminating harmful extremes? Might adaptive fine‑tuning, which adjusts psychometric personas per user or context, offer a middle ground? As regulators and stakeholders demand transparent alignment practices, the Pinocchio metric could become a benchmark for evaluating whether an LLM’s behavioral profile meets both safety standards and the nuanced expectations of its human users. Ongoing research will need to balance technical performance against preserving the human‑like qualities that make these models useful.
