Abstract: Ensuring that AI systems do what we, as humans, actually want them to do is one of the biggest open research challenges in AI alignment and safety. My research seeks to directly address this challenge by enabling AI systems to interact with humans to learn aligned and robust behaviors. The way robots and other AI systems behave is often the result of optimizing a reward function. However, manually designing good reward functions is highly challenging and error‐prone, even for domain experts. Human feedback in the form of demonstrations or preferences is often much easier to obtain than a hand-specified reward function, but it can be difficult to interpret due to ambiguity and noise. Thus, it is critical that AI systems take into account epistemic uncertainty over the human's true intent. As part of the AAAI New Faculty Highlight Program, I will give an overview of my research progress in the following fundamental research areas: (1) efficiently quantifying uncertainty over human intent, (2) directly optimizing behavior to be robust to uncertainty over human intent, and (3) actively querying for additional human input to reduce uncertainty over human intent.
Daniel S. Brown
AI Magazine
University of Utah
Daniel S. Brown studied this question.
www.synapsesocial.com/papers/68c1d60654b1d3bfb60f9839 — DOI: https://doi.org/10.1002/aaai.70024