The Gradient Podcast
Iason Gabriel frames AI ethics on a human‑rights footing, arguing that value alignment must start from the simple premise that all human lives hold equal worth. From this bedrock he builds a pluralistic approach that seeks to encode fairness across cultures, religions, and political systems. By treating value alignment as a world‑making activity, his work connects abstract moral philosophy to concrete AI governance challenges, positioning distributive justice and civil discourse at the heart of advanced system design.
A central tension explored in the episode is the balance between democratic civility and agonistic pluralism. Gabriel draws on Chantal Mouffe’s contestation model while insisting that fair procedural mechanisms can generate overlapping consensus: agreements that diverse stakeholders recognize as legitimate even if they do not share identical principles. He illustrates this with real‑world AI deployments, such as hospital resource‑allocation tools, where deliberative assemblies negotiate trade‑offs between efficiency and waiting times. This flexible, context‑sensitive alignment strategy moves beyond static theoretical frameworks toward adaptive governance that respects local values.
The conversation also delves into Rawlsian concepts like the veil of ignorance, drawing on experimental work suggesting that once people commit to fair principles, they tend to honor them despite self‑interest. Gabriel’s research on speech‑act theory further highlights the need to accommodate oral cultures and varied discourse forms when aligning language models. Together, these insights suggest that robust AI ethics requires a blend of philosophical rigor, participatory design, and cultural humility, offering a roadmap for policymakers and technologists aiming to embed justice into the next generation of intelligent systems.
Episode 143
I spoke with Iason Gabriel about:
Value alignment
Technology and worldmaking
How AI systems affect individuals and the social world
Iason is a philosopher and Senior Staff Research Scientist at Google DeepMind. His work focuses on the ethics of artificial intelligence, including questions about AI value alignment, distributive justice, language ethics and human rights.
You can find him on his website and Twitter/X.
Find me on Twitter (or LinkedIn if you want…) for updates, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.
Outline
(00:00) Intro
(01:18) Iason’s intellectual development
(04:28) Aligning language models with human values, democratic civility and agonism
(08:20) Overlapping consensus, differing norms, procedures for identifying norms
(13:27) Rawls’ theory of justice, the justificatory and stability problems
(19:18) Aligning LLMs and cooperation, speech acts, justification and discourse norms, literacy
(23:45) Actor Network Theory and alignment
(27:25) Value alignment and Iason’s starting points
(33:10) The Ethics of Advanced AI Assistants, AI’s impacts on social processes and users, personalization
(37:50) AGI systems and social power
(39:00) Displays of care and compassion, Machine Love (Joel Lehman)
(41:30) Virtue ethics, morality and language, virtue in AI systems vs. MacIntyre’s conception in After Virtue
(45:00) The Challenge of Value Alignment
(45:25) Technologists as worldmakers
(51:30) Technological determinism, collective action problems
(55:25) Iason’s goals with his work
(58:32) Outro
Links
Papers:
AI, Values, and Alignment (2020)
Aligning LMs with Human Values (2023)
Toward a Theory of Justice for AI (2023)
The Ethics of Advanced AI Assistants (2024)
A matter of principle? AI alignment as the fair treatment of claims (2025)