September 10, 2025 (Open Access)

Toward robust, interactive, and human‐aligned AI systems

Key Points

The research enables AI systems to learn aligned and robust behaviors by interacting with humans.
Human feedback, such as demonstrations or preferences, is easier to obtain than a hand-designed reward function, but ambiguity and noise make it harder to interpret.
Quantifying epistemic uncertainty over the human's true intent is critical for AI alignment and safety.
Actively querying for additional human input reduces that uncertainty and supports more robust behavior.

Abstract

Ensuring that AI systems do what we, as humans, actually want them to do is one of the biggest open research challenges in AI alignment and safety. My research seeks to directly address this challenge by enabling AI systems to interact with humans to learn aligned and robust behaviors. The way robots and other AI systems behave is often the result of optimizing a reward function. However, manually designing good reward functions is highly challenging and error‐prone, even for domain experts. Human feedback in the form of demonstrations or preferences is often much easier to obtain than a hand-specified reward function, but it can be difficult to interpret due to ambiguity and noise. Thus, it is critical that AI systems take into account epistemic uncertainty over the human's true intent. As part of the AAAI New Faculty Highlight Program, I will give an overview of my research progress along the following fundamental research areas: (1) efficiently quantifying uncertainty over human intent, (2) directly optimizing behavior to be robust to uncertainty over human intent, and (3) actively querying for additional human input to reduce uncertainty over human intent.
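The three research areas listed above can be illustrated with a toy sketch. This is not the author's method: it assumes a small, hypothetical set of candidate reward weight vectors (`HYPOTHESES`), hand-picked behavior features (`BEHAVIORS`), a Bradley-Terry preference model with an assumed rationality parameter `beta`, a simple worst-case rule for robustness, and a "closest to a coin flip" heuristic for choosing the next query.

```python
import math
from itertools import combinations

# Hypothetical setup: each behavior is summarized by a feature vector, and the
# human's intent is assumed to be one of a few candidate reward weight vectors.
BEHAVIORS = {
    "fast": (1.0, 0.0),
    "safe": (0.0, 1.0),
    "balanced": (0.6, 0.6),
}
HYPOTHESES = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]


def ret(w, phi):
    """Return of a behavior with features phi under reward weights w."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def pref_prob(w, a, b, beta=5.0):
    """Bradley-Terry probability that behavior a is preferred to b under w."""
    diff = ret(w, BEHAVIORS[a]) - ret(w, BEHAVIORS[b])
    return 1.0 / (1.0 + math.exp(-beta * diff))


def update(prior, a, b):
    """(1) Bayes update of the belief after observing 'a preferred to b'."""
    unnorm = [p * pref_prob(w, a, b) for p, w in zip(prior, HYPOTHESES)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def robust_choice(belief, min_mass=0.05):
    """(2) Behavior with the best worst-case return over plausible hypotheses."""
    plausible = [w for p, w in zip(belief, HYPOTHESES) if p >= min_mass]
    return max(
        BEHAVIORS,
        key=lambda name: min(ret(w, BEHAVIORS[name]) for w in plausible),
    )


def next_query(belief):
    """(3) Pair whose predicted answer under the belief is closest to 50/50."""
    def gap(pair):
        p = sum(q * pref_prob(w, *pair) for q, w in zip(belief, HYPOTHESES))
        return abs(p - 0.5)
    return min(combinations(BEHAVIORS, 2), key=gap)


if __name__ == "__main__":
    belief = [1.0 / len(HYPOTHESES)] * len(HYPOTHESES)
    print("robust choice under uniform belief:", robust_choice(belief))
    query = next_query(belief)
    print("next query:", query)
    # Simulate a human who answers that "safe" is preferred to "fast".
    belief = update(belief, "safe", "fast")
    print("updated belief:", belief)
```

The worst-case rule is only one way to be robust to uncertainty; risk-sensitive objectives over the posterior are a common alternative, and the query heuristic could be replaced by an information-gain criterion.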
