Research

Current Research: Using Alignment to Improve Reward Design in RL

Past Research: Improving Preference-based RL Algorithms

Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning

In this work, we propose a simple technique that uses reward-free, low-quality data to boost the performance of off-the-shelf preference-based RL algorithms. Importantly, we validate our approach with a human-user study.

Published at ICLR 2025.
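
For readers unfamiliar with the area, the sketch below shows the standard Bradley-Terry preference loss that off-the-shelf preference-based RL algorithms (e.g., PEBBLE-style methods) use to fit a reward model from pairwise human feedback. It is generic background, not the technique proposed in this paper; the class/function names, shapes, and hyperparameters are illustrative.

```python
# Minimal sketch of Bradley-Terry reward learning from pairwise preferences
# (standard background for preference-based RL, not this paper's method).
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        # Per-step reward prediction for (state, action) pairs.
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def preference_loss(model, seg0, seg1, label):
    """Bradley-Terry loss over two trajectory segments.

    seg0, seg1: tuples (obs, act), each of shape (batch, horizon, dim)
    label: float tensor of shape (batch,), 1.0 if the human preferred seg1,
           0.0 if they preferred seg0.
    """
    r0 = model(*seg0).sum(dim=-1)  # summed predicted reward of segment 0
    r1 = model(*seg1).sum(dim=-1)  # summed predicted reward of segment 1
    logits = r1 - r0               # P(seg1 preferred) = sigmoid(r1 - r0)
    return nn.functional.binary_cross_entropy_with_logits(logits, label)
```

The learned per-step reward then replaces the (unavailable) environment reward when training the policy.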

In this paper, we tackle the problem of learning reward models from human preferences in extremely noisy environments. We find that by leveraging principles of dynamic sparse training, reward models can learn to focus on task-relevant features.

Published at AAMAS 2025 (Extended Abstract).
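
To give a flavor of the idea, here is a generic dynamic-sparse-training step in the spirit of SET/RigL: a binary mask over the reward model's weights is periodically updated by pruning the weakest active connections and regrowing the same number elsewhere. This is an illustrative sketch under those assumptions, not the exact procedure from the paper; the function name and drop fraction are made up.

```python
# Generic sketch of one dynamic-sparse-training mask update (SET-style):
# drop the smallest-magnitude active weights, regrow the same number at
# random inactive positions. Not the paper's exact procedure.
import torch

@torch.no_grad()
def update_mask(weight: torch.Tensor, mask: torch.Tensor, drop_frac: float = 0.3):
    """Prune the weakest active connections and regrow elsewhere.

    weight: contiguous weight tensor of one layer
    mask:   contiguous float tensor of the same shape, entries in {0, 1}
    """
    active = mask.bool()
    n_drop = int(drop_frac * active.sum().item())
    if n_drop == 0:
        return mask

    # 1) Drop: deactivate the n_drop active weights with the smallest magnitude.
    magnitudes = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(magnitudes.view(-1), n_drop, largest=False).indices
    mask.view(-1)[drop_idx] = 0.0

    # 2) Grow: activate n_drop connections chosen uniformly among inactive ones
    #    (a just-dropped position may occasionally be re-selected; fine for a sketch).
    inactive_idx = (mask.view(-1) == 0).nonzero(as_tuple=True)[0]
    grow_idx = inactive_idx[torch.randperm(len(inactive_idx))[:n_drop]]
    mask.view(-1)[grow_idx] = 1.0

    # Newly grown weights start at zero.
    weight.view(-1)[grow_idx] = 0.0
    return mask
```

Between mask updates, the mask is re-applied after each optimizer step (e.g., weight.mul_(mask)) so that pruned connections stay inactive.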

Publications

You can find a full list of my publications on my Google Scholar profile.