Topic: MIND-Net: A Novel Learning Network
Speaker: Prof. S.Y. Kung, Princeton University
Date: July 4, at 9:00 AM
Venue: F706, New Main Building
The success of deep neural networks (DNNs) hinges upon the rich nonlinear space embedded in their nonlinear hidden neuron layers. On the other hand, the prevalent concerns over deep learning fall on two major fronts: one analytical and one structural. In this talk, we introduce a learning model called MIND-Net, with a new learning paradigm that monotonically increases the discriminative power (quantified by DI) of the classifying networks. It offers a learning tool to efficiently tackle both the analytical and structural concerns over deep learning networks.
From the analytical perspective, the ad hoc nature of deep learning leaves its success at the mercy of trial and error. To rectify this problem, we advocate a methodical learning paradigm, MIND-Net, which is computationally efficient in training the networks and yet mathematically tractable to analyze. MIND-Net hinges upon the use of an effective optimization metric, called Discriminant Information (DI). It will be used as a surrogate for popular metrics such as 0-1 loss or prediction accuracy. Mathematically, DI is equivalent or closely related to Gauss’ LSE, Fisher’s FDR, and Shannon’s Mutual Information. We shall explain why a higher DI means higher linear separability, i.e., why a higher DI means that the data are more discriminable. In fact, it can be shown, both theoretically and empirically, that a high DI score usually implies a high prediction accuracy.
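As a rough illustration of the intuition that a higher discriminant score goes with more separable data, the sketch below computes the classical two-class Fisher discriminant ratio for a single scalar feature. This is not the DI definition used in the talk, which is not given here; it is only a textbook relative of it. The function name and the toy data are hypothetical and chosen for illustration.

```python
from statistics import fmean, pvariance


def fisher_discriminant_ratio(class_a, class_b):
    """Classical two-class Fisher discriminant ratio for one scalar feature.

    Computed as (mean_a - mean_b)^2 / (var_a + var_b).
    Assumption: this stands in for DI only as an illustration; it is not
    the DI formula from the talk.
    """
    mean_gap = fmean(class_a) - fmean(class_b)
    within = pvariance(class_a) + pvariance(class_b)
    if within == 0:
        raise ValueError("within-class variance is zero; the ratio is undefined")
    return mean_gap ** 2 / within


# Hypothetical toy data: same spread in every class, different gaps between means.
overlapping_a = [0.0, 1.0, 2.0, 3.0]
overlapping_b = [1.0, 2.0, 3.0, 4.0]
separated_a = [0.0, 1.0, 2.0, 3.0]
separated_b = [10.0, 11.0, 12.0, 13.0]

fdr_overlapping = fisher_discriminant_ratio(overlapping_a, overlapping_b)
fdr_separated = fisher_discriminant_ratio(separated_a, separated_b)

print(f"overlapping classes: {fdr_overlapping:.2f}")
print(f"separated classes:   {fdr_separated:.2f}")
```

With the within-class spread held fixed, the ratio grows with the squared gap between the class means, so the well-separated pair scores far higher than the overlapping pair.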
On the structural front, the curse of depth is widely recognized as a cause of serious concern. Fortunately, many solutions have been proposed to combat or alleviate this curse. Likewise, MIND-Net offers yet another cost-effective solution by circumventing the depth problem altogether via a new notion (or trick) of Omni-present Supervision (OS), i.e., teachers hidden in a “Trojan horse” that is transported (along with the training data) from the input to each of the hidden layers. Opening up the Trojan horse at any hidden layer, we have direct access to the teacher’s information for free, in the sense that no back-propagation (BP) is incurred. In short, it amounts to learning with no propagation. By harnessing the teacher information, we will be able to construct a new and slender “inheritance layer” to summarize all the discriminant information amassed by the previous layer. Moreover, by horizontally augmenting the inheritance layer with additional randomized nodes and applying BP learning, the discriminant power of the newly augmented network will be further enhanced.
In our experiments, MIND-Net was applied to synthetic and real-world datasets, e.g., the CIFAR-10 dataset, based on features extracted from different layers of ResNets. The results generally support our theoretical predictions and yield some performance improvements.
Biography of the Speaker:
S.Y. Kung, Life Fellow of IEEE, is a Professor in the Department of Electrical Engineering at Princeton University. His research areas include machine learning, data mining, systematic design of (deep-learning) neural networks, statistical estimation, VLSI array processors, signal and multimedia information processing, and most recently compressive privacy. He was a founding member of several Technical Committees (TC) of the IEEE Signal Processing Society. He was elected to Fellow in 1988 and served as a Member of the Board of Governors of the IEEE Signal Processing Society (1989-1991). He was a recipient of IEEE Signal Processing Society's Technical Achievement Award for his contributions on "parallel processing and neural network algorithms for signal processing" (1992); a Distinguished Lecturer of IEEE Signal Processing Society (1994); a recipient of IEEE Signal Processing Society's Best Paper Award for his publication on principal component neural networks (1996); and a recipient of the IEEE Third Millennium Medal (2000). Since 1990, he has been the Editor-In-Chief of the Journal of VLSI Signal Processing Systems. He served as the first Associate Editor in VLSI Area (1984) and the first Associate Editor in Neural Network (1991) for the IEEE Transactions on Signal Processing. He has authored or co-authored more than 500 technical publications and numerous textbooks including VLSI Array Processors, Prentice-Hall (1988); Digital Neural Networks, Prentice-Hall (1993); Principal Component Neural Networks, John Wiley (1996); Biometric Authentication: A Machine Learning Approach, Prentice-Hall (2004); and Kernel Methods and Machine Learning, Cambridge University Press (2014).
School of Electronic and Information Engineering