Ph.D. Dissertation Defense: Amit Kumar

Friday, October 11, 2019
3:00 p.m.
IRB 4107
Maria Hoo
301 405 3681
mch@umd.edu

ANNOUNCEMENT:  Ph.D. Dissertation Defense


Name: Amit Kumar   

Committee:
Professor Rama Chellappa, Chair/Advisor
Professor Behtash Babadi
Professor Larry Davis
Professor Vishal Patel
Professor Ramani Duraiswami, Dean’s Representative

Date/time: Friday, October 11th, 2019 at 3PM 

Location:  IRB 4107

Title: ROBUST FACIAL LANDMARKS LOCALIZATION WITH APPLICATIONS IN FACIAL BIOMETRICS


Abstract:  

Localization of regions of interest in images and videos is a well-studied problem in the computer vision community. Usually, localization tasks imply localization of objects in a given image, such as detection and segmentation of objects in images. However, the region of interest can be limited to a single pixel, as in the tasks of facial landmark localization and human pose estimation. This dissertation studies robust facial landmark detection algorithms for faces in the wild using learning methods based on Convolutional Neural Networks.

Detection of specific keypoints on face images is an integral pre-processing step in facial biometrics and numerous other applications, including face verification and identification. Detecting keypoints allows face images to be aligned to a canonical coordinate system using geometric transforms such as similarity or affine transformations, mitigating the adverse effects of rotation and scaling. This challenging problem has become more attractive in recent years as a result of advances in deep learning and the release of more unconstrained datasets. The research community is pushing boundaries to achieve better and better performance on unconstrained images, where the images are diverse in pose, expression and lighting conditions.
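
The alignment step described above can be illustrated with a minimal sketch. It is not taken from the dissertation: it assumes 2D keypoints given as (x, y) pairs and fits a least-squares similarity transform (scale, rotation, translation) by treating each point as a complex number. The function names and the three-point template are hypothetical.

```python
import cmath
import math


def fit_similarity(src, dst):
    """Least-squares 2D similarity transform mapping src points onto dst points.

    Points are (x, y) pairs. Returns complex numbers (a, b) such that
    dst is approximately a * src + b, where abs(a) is the scale and
    the phase of a is the rotation angle.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matching point pairs")
    z = [complex(x, y) for x, y in src]
    w = [complex(x, y) for x, y in dst]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    # Closed-form minimizer of sum |a * z_i + b - w_i|^2 over complex a and b.
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("source points must not all coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def apply_similarity(a, b, points):
    """Apply the transform p -> a * p + b to a list of (x, y) points."""
    out = []
    for x, y in points:
        p = a * complex(x, y) + b
        out.append((p.real, p.imag))
    return out


def scale_and_rotation(a):
    """Return (scale, rotation in degrees) encoded by the complex factor a."""
    return abs(a), math.degrees(cmath.phase(a))
```

As a usage example, a hypothetical canonical template such as `[(30.0, 40.0), (70.0, 40.0), (50.0, 65.0)]` (two eye centers and a nose tip) can serve as `dst`, with the detected keypoints as `src`. Applying the fitted `(a, b)` to the detected points, or to every pixel coordinate of the face crop, brings the face into the canonical frame. An affine fit would add shear and non-uniform scaling at the cost of needing at least three non-collinear points.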

Over the years, researchers have developed various hand-crafted techniques to extract meaningful features from face images, most of them appearance-based or geometry-based. However, these features do not perform well on data collected in unconstrained settings because of large variations in appearance and other nuisance factors. Deep Convolutional Neural Networks (DCNNs) have become prominent because of their ability to extract discriminative features. Unlike hand-crafted pipelines, DCNNs learn feature extraction and classification from the data itself in an end-to-end fashion. This makes DCNNs robust to variations present in the data while improving their discriminative ability.

In this dissertation, we discuss three different methods for facial keypoint detection based on Convolutional Neural Networks. The methods are generic and can be extended to the related problem of keypoint detection for human pose estimation. The first method, called Cascaded Local Deep Descriptor Regression, uses deep features extracted around local points to learn linear regressors that incrementally correct the initial estimate of the keypoints. In the second method, called KEPLER, we develop efficient Heatmap CNNs to directly learn the non-linear mapping between the input and target spaces. We also apply different regularization techniques to tackle the effects of imbalanced data and vanishing gradients. In the third method, we model the spatial correlation between different keypoints using Pose Conditioned Convolution Deconvolution Networks (PCD-CNN), while making the network pose-agnostic by disentangling pose from the face image. Next, we show an application of facial landmark localization, in which it is used to align face images for the task of apparent age estimation of humans from unconstrained images.
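
To make the heatmap formulation concrete, the following is a minimal sketch of how a keypoint is commonly encoded as a heatmap target and read back from a predicted heatmap. It does not reproduce the KEPLER or PCD-CNN architectures: the Gaussian target, the `sigma` value and the argmax decoding are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=1.5):
    """Build a height x width heatmap with a Gaussian bump centered at (cx, cy).

    This is one common choice of regression target for a single keypoint;
    the peak value is 1.0 when (cx, cy) lies on an integer pixel.
    """
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def decode_heatmap(heatmap):
    """Return (x, y, score) of the highest-scoring pixel in a heatmap."""
    best_x, best_y, best_score = 0, 0, heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_score:
                best_x, best_y, best_score = x, y, value
    return best_x, best_y, best_score
```

In this setup a network would predict one such heatmap per landmark, and `decode_heatmap` would be applied to each channel; the returned score can act as a rough confidence for that landmark. Sub-pixel refinement around the peak is a frequent addition and is omitted here.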

In the fourth part of this dissertation, we discuss the impact of good-quality landmarks on the task of face verification. Previously proposed methods perform with reasonable accuracy on high-resolution, good-quality images but fail when the input image suffers from degradation. To this end, we propose a semi-supervised method that aims at predicting landmarks in low-quality images. This method learns to predict landmarks in low-resolution images by learning to model the learning process on high-resolution images. In this algorithm, we use Generative Adversarial Networks, which first learn to model the distribution of real low-resolution images, after which another CNN learns to model the distribution of heatmaps on the images. We also propose another high-quality facial landmark detection method, which is currently the state of the art.

Finally, we discuss the extension of ideas developed for facial keypoint localization to the task of human pose estimation, which is one of the important cues for Human Activity Recognition. As in PCD-CNN, the parts of the human body can also be modelled in a tree structure, where the relationships between these parts are learnt through convolutions while being conditioned on the 3D pose and orientation. Another interesting avenue for research is extending facial landmark localization to naturally degraded images.

 

Audience: Graduate  Faculty 

 
