Few-Shot Learning and AI Automation
Machine learning has been highly successful in data-intensive applications, but its performance often degrades when only small datasets are available, motivating the problem of few-shot learning. A common drawback of current few-shot methods is their reliance on prior knowledge, which limits their broad applicability. Deep neural networks (NNs) have achieved great success in many domains, including computer vision, natural language processing, self-driving cars, protein folding, healthcare, finance, and security. However, training NNs and designing their architectures remain significant challenges because they demand large amounts of data and expensive computing time. NN training is usually carried out by backpropagation (BP) using gradient information, but there is no universal mechanism describing how errors propagate during the training of vastly different neural architectures.

We develop a framework from the perspectives of network science and dynamical systems to build a deep understanding of the training process and thereby design better neural architectures. We view an NN as a complex network with multiple layers and formulate the training process as implicit nonlinear dynamics on that network, analogous to an ecological system modeled by Lotka–Volterra equations. The steady states of these equations correspond to the optimal weights of the neural network, but solving them directly is difficult due to their nonlinearity, high dimensionality, and uncertainty. Using a mean-field technique, we condense the coupled nonlinear equations into an effective single equation that still captures the key macroscopic features of the training process. The reduced system can be solved easily and, more importantly, it yields topological properties that characterize the predictive performance of NNs. These properties can be fully utilized for a direct and efficient design of NN architectures in a reduced space, and for robust few-shot learning with dramatically less data. In essence, our approach contributes an essential ingredient to AutoAI.
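To make the mean-field reduction concrete, the following is a minimal sketch of collapsing coupled nonlinear network dynamics into a single effective equation. The network, the saturating coupling function `f`, and all parameters are illustrative assumptions for a generic system of the form dx_i/dt = F(x_i) + Σ_j A_ij G(x_j), not the project's actual training dynamics; the degree-weighted effective state and effective coupling β_eff follow the standard mean-field recipe.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
# Assumed random weighted network: Erdos-Renyi topology, uniform weights
A = np.where(rng.random((n, n)) < 0.2, rng.random((n, n)), 0.0)
np.fill_diagonal(A, 0.0)

f = lambda x: x / (1.0 + x)  # illustrative saturating coupling

# Full coupled system: dx_i/dt = -x_i + sum_j A_ij * f(x_j),
# integrated to its steady state with forward Euler
x = np.ones(n)
dt = 0.01
for _ in range(5000):
    x += dt * (-x + A @ f(x))

one = np.ones(n)
w = one @ A @ one
beta_eff = (one @ A @ A @ one) / w   # effective coupling strength
x_eff_sim = (one @ A @ x) / w        # degree-weighted mean steady state

# Reduced 1-D equation: dx_eff/dt = -x_eff + beta_eff * f(x_eff);
# its nonzero steady state solves x_eff = beta_eff * x_eff/(1+x_eff),
# i.e. x_eff = beta_eff - 1 (when beta_eff > 1)
x_eff_pred = beta_eff - 1.0

print(f"beta_eff = {beta_eff:.3f}")
print(f"x_eff (full simulation)   = {x_eff_sim:.3f}")
print(f"x_eff (1-D reduction)     = {x_eff_pred:.3f}")
```

The single reduced equation tracks the macroscopic steady state of the full n-dimensional system closely for this homogeneous random network; heterogeneous topologies would show larger deviations.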
Jianxi Gao (PI)
Chunheng Jiang (PhD)
Kushal Bhandari (PhD)
Zhenhan Huang (Master)