The Frontier of Deep Learning in Computer Vision
Release time: December 19, 2016
Deep supervision for deep learning
Speaker: Prof. Tu (Zhuowen Tu) from the Department of Brain and Cognitive Sciences, UCSD
Date: Dec 23, 2016 (Friday), 3:00-4:00 PM
Venue: ROOM 416, Yifu Building
Key words: computer vision, machine learning, deep learning, neural computation
Brief introduction of the speaker:
Zhuowen Tu is an associate professor and doctoral supervisor in the Department of Brain and Cognitive Sciences, UCSD, and is widely regarded as a leading international expert in machine vision.
In this talk, the motivation for and benefits of introducing deep supervision to deep learning, convolutional neural networks in particular, will be discussed. We will then focus on our recent work, the holistically-nested edge detection algorithm (HED), which tackles two important issues in this long-standing vision problem: (1) holistic image training and prediction; and (2) multi-scale and multi-level feature learning. HED performs image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets, and it automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are important for resolving the challenging ambiguity in edge and object boundary detection. We significantly advance the state of the art on the general edge/boundary detection task, reaching human-level performance for the first time in the literature (to the best of our knowledge).
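To make the idea of deep supervision on side responses concrete, the following is a minimal numpy sketch, not the authors' implementation: each side output receives its own class-balanced cross-entropy loss (edge pixels are rare, so the positive and negative terms are reweighted, as in the HED paper), and a weighted fusion of the side outputs receives an additional loss. All array shapes and the fusion weights here are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def balanced_bce(pred, label, eps=1e-7):
    # Class-balanced cross-entropy: beta is the fraction of negative
    # (non-edge) pixels, so the scarce positive class is up-weighted.
    beta = 1.0 - label.mean()
    pred = np.clip(pred, eps, 1.0 - eps)
    pos = -beta * (label * np.log(pred)).sum()
    neg = -(1.0 - beta) * ((1.0 - label) * np.log(1.0 - pred)).sum()
    return pos + neg

def deeply_supervised_loss(side_logits, fuse_weights, label):
    # Deep supervision: every side output is supervised directly,
    # and a weighted fusion of all side outputs is supervised as well.
    side_preds = [sigmoid(s) for s in side_logits]
    side_loss = sum(balanced_bce(p, label) for p in side_preds)
    fused = sigmoid(sum(w * s for w, s in zip(fuse_weights, side_logits)))
    return side_loss + balanced_bce(fused, label)
```

In a real HED network the side logits would come from convolutional layers tapped at different depths (hence different scales), and the fusion weights would be learned jointly with the rest of the network.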
In the second half of the talk, we will present a recent top-down learning method that learns priors of structured labels, the convolutional pseudoprior (ConvPP). Compared with classical machine learning algorithms like CRFs and structural SVMs, ConvPP automatically learns rich convolutional kernels to capture both short- and long-range contexts; compared with cascade classifiers like auto-context, ConvPP avoids the iterative steps of learning a series of discriminative classifiers and automatically learns contextual configurations; compared with recent efforts combining CNN models with CRFs and RNNs, ConvPP learns convolution in the labeling space with much improved modeling capability and less manual specification; compared with Bayesian models like MRFs, ConvPP capitalizes on the rich representational power of convolution by automatically learning priors built on convolutional filters. We accomplish our task using a pseudo-likelihood approximation to the prior under a novel fixed-point network structure that facilitates an end-to-end learning process.
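The fixed-point structure can be illustrated with a toy numpy sketch, again an assumption-laden illustration rather than the ConvPP model itself: a per-pixel unary (appearance) score is combined with a convolutional "pseudoprior" computed from the current label estimate, and this update is iterated until the label map stabilizes. The kernel, unary scores, and iteration count below are all made up for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(x, k):
    # Naive same-padded 2-D correlation; stands in for the learned
    # convolution in the labeling space.
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * k).sum()
    return out

def convpp_inference(unary, prior_kernel, steps=10):
    # Fixed-point iteration: re-estimate the label map from the unary
    # score plus a convolutional pseudoprior over the current labels.
    y = sigmoid(unary)
    for _ in range(steps):
        context = conv2d_same(y, prior_kernel)
        y = sigmoid(unary + context)
    return y
```

In the actual method, the prior kernels are learned end-to-end under a pseudo-likelihood approximation; the sketch only shows why convolving the label map lets context from both nearby and distant labels feed back into each pixel's prediction.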