Event
M.S. Thesis Defense: Ryan Tse
Monday, December 2, 2024
12:00 p.m.
AVW 2460
Maria Hoo
301 405 3681
mch@umd.edu
ANNOUNCEMENT: M.S. Thesis Defense
Name: Ryan Tse
Committee:
Professor Kaiqing Zhang, Chair
Professor Shuvra S. Bhattacharyya
Professor Dinesh Manocha
Date/Time: Monday, December 2, 2024 from 12:00PM to 1:30PM
Location: AVW 2460
Title: Representation Learning for Reinforcement Learning: Modeling Non-Gaussian Transition Probabilities with a Wasserstein Critic
Abstract:
Reinforcement learning algorithms depend on effective state representations when solving complex, high-dimensional environments. Recent methods learn state representations using auxiliary objectives that aim to capture relationships between states that are behaviorally similar, meaning states that lead to similar future outcomes under optimal policies. These methods learn explicit probabilistic state transition models and compute distributional distances between state transition probabilities as part of their measure of behavioral similarity.
This thesis presents a novel extension to several of these methods that directly learns the 1-Wasserstein distance between state transition distributions by exploiting the Kantorovich-Rubenstein duality. This method eliminates parametric assumptions about the state transition probabilities while providing a smoother estimator of distributional distances. Empirical evaluation demonstrates improved sample efficiency over some of the original methods and a modest increase in computational cost per sample. The results establish that relaxing theoretical assumptions about state transition modeling leads to more flexible and robust representation learning while maintaining strong performance characteristics.
This thesis presents a novel extension to several of these methods that directly learns the 1-Wasserstein distance between state transition distributions by exploiting the Kantorovich-Rubenstein duality. This method eliminates parametric assumptions about the state transition probabilities while providing a smoother estimator of distributional distances. Empirical evaluation demonstrates improved sample efficiency over some of the original methods and a modest increase in computational cost per sample. The results establish that relaxing theoretical assumptions about state transition modeling leads to more flexible and robust representation learning while maintaining strong performance characteristics.