Research
Current Research: Using Alignment to Improve Reward Design in RL
Past Research: Improving Preference-based RL Algorithms
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
In this work, we propose a simple technique that uses reward-free, low-quality data to boost the performance of off-the-shelf preference-based RL algorithms. Importantly, we validate our approach with a human-user study.
Published at ICLR 2025.
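For background, preference-based RL methods of this kind typically learn a reward model with a Bradley-Terry objective over pairs of trajectory segments labeled by a human. The sketch below shows only that standard objective, not the paper's technique for exploiting sub-optimal, reward-free data; the class and function names, tensor shapes, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the paper's method): the standard Bradley-Terry
# reward-learning objective that most preference-based RL pipelines build on.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps a state(-action) feature vector to a scalar reward estimate."""
    def __init__(self, obs_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def preference_loss(model: RewardModel,
                    seg_a: torch.Tensor,
                    seg_b: torch.Tensor,
                    label: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss on a batch of segment pairs.

    seg_a, seg_b: (batch, time, obs_dim) trajectory segments.
    label: (batch,) tensor, 1.0 if the human preferred seg_a, else 0.0.
    """
    # Segment "return" under the learned reward: sum of per-step rewards.
    r_a = model(seg_a).sum(dim=1)
    r_b = model(seg_b).sum(dim=1)
    # P(a preferred over b) = sigmoid(r_a - r_b)
    logits = r_a - r_b
    return nn.functional.binary_cross_entropy_with_logits(logits, label)
```

The learned reward model is then used in place of an environment reward to train a standard RL agent, with preference queries collected online or from a buffer.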
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
In this paper, we tackle the problem of learning reward models from human preferences in extremely noisy environments. We find that by leveraging principles of dynamic sparse training, reward models can effectively learn to focus on task-relevant features.
Published at AAMAS 2025 (Extended Abstract).
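Dynamic sparse training generally maintains a sparse connectivity mask that is periodically updated by pruning low-magnitude weights and regrowing new connections. The sketch below illustrates that generic prune-and-regrow step (in the style of Sparse Evolutionary Training), not the specific procedure used in the paper; all names and the drop fraction are assumptions.

```python
# Minimal sketch of a generic dynamic-sparse-training update (SET-style):
# periodically drop the lowest-magnitude active weights and regrow the same
# number of connections at random. Not the paper's exact procedure.
import torch

def prune_and_regrow(weight: torch.Tensor,
                     mask: torch.Tensor,
                     drop_fraction: float = 0.3) -> torch.Tensor:
    """One prune/regrow step on a single weight matrix.

    weight: dense parameter tensor of a reward-model layer.
    mask:   binary tensor of the same shape; 1 marks active connections.
    """
    active = mask.bool()
    n_drop = int(drop_fraction * active.sum().item())
    if n_drop == 0:
        return mask

    # Prune: deactivate the n_drop active weights with the smallest magnitude.
    magnitudes = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(magnitudes.flatten(), n_drop, largest=False).indices
    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = 0.0

    # Regrow: activate n_drop currently inactive connections chosen at random
    # (in this simplified version, just-pruned positions may be re-selected).
    inactive_idx = (new_mask == 0).nonzero(as_tuple=True)[0]
    grow_idx = inactive_idx[torch.randperm(inactive_idx.numel())[:n_drop]]
    new_mask[grow_idx] = 1.0

    return new_mask.view_as(mask)
```

In such a setup, the mask would be re-applied to the layer weights after each optimizer step, and a prune-and-regrow update would run every few hundred gradient steps, letting the reward model's connectivity drift toward task-relevant input features.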