Ph.D. Dissertation Defense - Ruiqi Xian

Friday, April 3, 2026
3:30 p.m.
AVW 1146

Name: Ruiqi Xian
 
Committee:
Professor Dinesh Manocha (Chair)
Professor Shuvra S. Bhattacharyya
Professor Pratap Tokekar
Professor Michael Wilson Otte
Professor Mumu Xu (Dean's Representative)


Date/Time: Friday, April 3, 2026, 3:30–5:30 p.m.
 
Location: AVW 1146

Zoom Link: https://umd.zoom.us/j/8634660155
 

Title: Spatial-Temporal Representation Learning for Aerial Perception
 
Abstract: Aerial perception plays a central role in autonomous systems that rely on aerial, overhead, or elevated-view observations for tasks such as video understanding, localization, and multi-camera analysis. Compared with conventional ground-view perception, these settings introduce distinctive challenges, including severe viewpoint and scale variation, cluttered backgrounds, spatially sparse task-relevant foreground cues, temporal ambiguity, and substantial discrepancies across views, modalities, and camera networks. These characteristics make it difficult to directly apply standard visual learning methods developed for ground-view imagery and videos. In this dissertation, we address these challenges through the lens of representation learning for aerial perception, with a particular focus on how spatial structure, temporal dynamics, and cross-view consistency can be modeled more effectively.


Our central thesis is that effective aerial perception depends on representations that better capture action-relevant spatial cues, model temporal relationships more reliably, remain practical under real-world deployment constraints, and transfer across heterogeneous sensing conditions. Guided by this perspective, we develop a sequence of methods organized around three themes. First, we study refined spatial-temporal representation learning for aerial video understanding, where we improve how aerial video representations capture spatially sparse action-relevant regions and temporally ambiguous motion under large background dominance and weak foreground visibility. In this theme, we develop methods for temporal feature alignment, informative frame sampling, and semantic guidance. Across representative aerial and video recognition benchmarks, our methods improve recognition accuracy by 2.2%–13.8% on UAV-Human, 6.8% on NEC-Drone, and 9.0% on Diving48.

Second, we study practical and scalable aerial perception under real-world constraints, where we develop aerial-specific methods that improve efficiency, reduce annotation dependence, and remain suitable for realistic deployment. In this theme, our deployment-oriented methods improve top-1 accuracy by 6.1%–7.4% on RoCoG-v2, outperform prior state of the art by 8.3%–10.4% on UAV-Human, and reach 95.9% accuracy on Drone Action, while also achieving 2× faster inference on an RB5 CPU. Complementing this, our self-supervised learning methods further improve top-1 accuracy by 2.9% on NEC-Drone and 5.8% on UAV-Human under limited annotation, while achieving 2×–5× faster inference than supervised approaches with heavy test-time augmentation.

Third, we study transferable aerial perception across views, modalities, and camera networks, where we learn representations that preserve task-relevant consistency across aerial-ground correspondence and multi-camera settings. In this context, our methods enable more robust aerial-ground localization under cross-view, cross-modal, and scale discrepancies, and support calibration-free multi-camera association without requiring camera calibration or manual identity annotations.

Taken together, our contributions establish a unified perspective on aerial perception through representation learning that is spatially aware, temporally effective, practically deployable, and transferable across sensing conditions. We show that improving aerial perception is not only a matter of designing stronger task-specific predictors, but of learning representations that better reflect the spatial structure, temporal dynamics, and sensing variability of aerial and elevated-view observations.

Audience: Graduate Faculty
