Credits: 3

Description

Prerequisites: Minimum grade of C- in ENEE324 or STAT400.

An introduction to techniques for efficient deep-learning computing that enable powerful AI on resource-constrained devices such as Raspberry Pi, Jetson Nano, microcontrollers, and edge TPUs. Through twice-weekly lectures and a weekly lab, students will learn to compress and optimize neural networks using model-compression techniques such as pruning and quantization, neural-architecture search, knowledge distillation, and on-device fine-tuning. The course also covers advanced topics such as distributed and parallel training, hardware-accelerated inference for vision and point-cloud tasks, and application-specific acceleration for generative AI (diffusion models and large language models). Labs provide hands-on experience training and deploying optimized models on embedded platforms, building practical skills in on-device computer vision, speech recognition, and language translation. The goal is to equip students with a thorough understanding of ML system design on edge hardware and to prepare them to build efficient IoT AI applications.

Semesters Offered

Fall 2026