
As the race to align artificial intelligence with human values accelerates, recent research suggests the field may be framing the problem too narrowly, from a single cultural vantage point. Ethical alignment has traditionally meant training models to conform to predefined norms, usually grounded in Western liberal values such as safety, impartiality, and harm avoidance. But two recent papers challenge this approach, offering powerful insight into how ethical alignment must evolve if AI is to serve a truly global, pluralistic world.
The first paper, How Ethical Should AI Be?, examines how alignment alters the behavior of large language models (LLMs) in high-stakes financial decision-making. The researchers found that models tuned for maximum ethical compliance, specifically for "helpfulness, harmlessness, and honesty" (HHH), tend to become excessively risk-averse. In simulated investment scenarios, these models consistently underinvested, even when risk-taking was statistically optimal. Ironically, over-alignment made the models less effective at creating value. And once trained in this way, the bias toward caution persisted even when prompted otherwise, suggesting that ethical fine-tuning is not just directional, but sticky and hard to reverse.
The second paper, Ethical Reasoning over Moral Alignment, takes aim at the assumption that models should be built with any fixed moral stance at all. Instead, it proposes that AI systems should develop the ability to reason across ethical contexts without being permanently bound to any one framework. The authors argue for an in-context approach, in which ethical reasoning is activated at the application level, guided by the users or environments in which the model is deployed. To support this, they present a formal structure of ethical policies, ranging from abstract value hierarchies to concrete dilemma-specific rules, and show that LLMs exhibit cultural bias toward Western moral values unless guided explicitly.
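To make the in-context idea more concrete, here is a minimal sketch, not taken from the paper itself, of how an application might hand an ethical policy to a chat model at deployment time. The EthicalPolicy structure, its field names, and the prompt wording are illustrative assumptions; only the two-tier split between abstract values and dilemma-specific rules echoes the paper's framing.

```python
from dataclasses import dataclass, field

# Hypothetical illustration: an "ethical policy" supplied at the application
# level rather than baked into the model's weights. The two tiers (abstract
# value hierarchy, concrete dilemma-specific rules) loosely mirror the paper's
# notion of ethical policies; the schema and wording here are assumptions.

@dataclass
class EthicalPolicy:
    name: str
    value_hierarchy: list[str]                               # abstract values, highest priority first
    dilemma_rules: list[str] = field(default_factory=list)   # concrete, situation-specific rules

def build_messages(policy: EthicalPolicy, user_query: str) -> list[dict]:
    """Assemble chat-style messages that activate ethical reasoning in context."""
    system_prompt = (
        f"Reason within the '{policy.name}' ethical policy.\n"
        "Value hierarchy (highest priority first):\n"
        + "\n".join(f"  {i + 1}. {v}" for i, v in enumerate(policy.value_hierarchy))
        + "\nSituation-specific rules:\n"
        + "\n".join(f"  - {r}" for r in policy.dilemma_rules)
        + "\nExplain which values or rules drive your answer."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_query},
    ]

# Example: the same model, two different ethical frames, no retraining required.
care_first = EthicalPolicy(
    name="care-first",
    value_hierarchy=["avoid harm to individuals", "fairness", "efficiency"],
    dilemma_rules=["When outcomes are uncertain, prefer the least harmful option."],
)
messages = build_messages(
    care_first,
    "Should the clinic share anonymized patient data with researchers?",
)
# `messages` can be passed to any chat-completion API; swapping the policy
# object swaps the ethical frame without touching the model's weights.
```

Under this kind of setup, the ethical frame lives in the interaction rather than in the weights: changing the policy object changes how the model is asked to reason, which is the practical difference the authors draw between moral alignment and in-context ethical reasoning.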
Taken together, these studies signal a necessary shift: from hardcoded alignment to contextual ethical reasoning. The future of trustworthy AI may not lie in universal moral conformity, but in the ability to understand, interpret, and act within diverse moral landscapes. Ethical intelligence, then, becomes not about being “safe” in all contexts, but about being adaptive, transparent, and situationally aware. For those building tomorrow’s AI systems, the lesson is clear: don’t just ask if your model is aligned – ask who it’s aligned with, when, and why.
Source Papers:
How Ethical Should AI Be?
Ethical Reasoning over Moral Alignment