Event
M.S. Thesis Defense: Doheon Lee
Thursday, July 16, 2026
1:00 p.m.
AVW 1146
Emily Irwin
301 405 0680
eirwin@umd.edu
Announcement: M.S. Thesis Defense
Student Name: Doheon Lee
Committee:
Prof. Shuvra S. Bhattacharyya (Chair)
Dr. Heesung Kwon (Co-Chair)
Prof. Abhinav Shrivastava
Date/Time: July 16, 2026 / 1:00 PM – 3:00 PM
Location: AVW 1146 (ISR)
Title: TIDE: Transition-Informed Decomposition and Encoding for Composed Pose Retrieval
Abstract:
Composed pose retrieval (CPR) aims to retrieve a target image from a gallery given a reference image and a natural language description of the desired pose transition. Unlike standard image retrieval, this task is inherently relational: the correct target is not the image that is globally most similar to the reference, but the one that best realizes the specified change from it. Existing approaches typically encode the reference and candidate images with holistic visual features, fuse the transition description at a late stage, and rank candidates using simple embedding similarity. This formulation underuses the transition description, since the text should determine not only how the reference is modified, but also which evidence in each candidate image is relevant for comparison. In this work, we propose TIDE: Transition-Informed Decomposition and Encoding, a framework that uses the transition description to adaptively decompose both reference and candidate images into changed and invariant representations. This allows the same candidate image to be interpreted differently depending on the requested pose transition, focusing the model on the body parts that matter while preserving contextual pose information that should remain stable. We further introduce CARS: Change-Aware Relational Scorer, a learned matching module that estimates retrieval compatibility through structured interactions between changed and invariant components, replacing fixed cosine similarity with adaptive relational scoring. Across CPR benchmark datasets, TIDE consistently outperforms existing baselines, demonstrating that transition-informed decomposition and change-aware relational scoring provide a stronger foundation for retrieving human poses according to language-specified transformations.
