Training Simplification and Model Simplification for Deep Learning:
A Minimal Effort Back Propagation Method
We propose a simple yet effective technique to simplify both the training and the resulting model of neural networks. In back propagation, only a small subset of the full gradient is computed to update the model parameters. The gradient vectors are sparsified so that only the top-$k$ elements (in terms of magnitude) are kept. As a result, only $k$ rows or columns (depending on the layout) of the weight matrix are modified, leading to a linear reduction in the computational cost. Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which reduces the computational cost of both training and decoding and can potentially accelerate decoding in real-world applications. Surprisingly, experimental results demonstrate that most of the time we only need to update fewer than 5\% of the weights at each back propagation pass. More interestingly, the accuracy of the resulting models is actually improved rather than degraded, and a detailed analysis is given. The model simplification results show that the model can be simplified adaptively, often shrinking by around 9x, with no loss in accuracy or even with improved accuracy.
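To make the idea concrete, here is a minimal NumPy sketch of top-$k$ gradient sparsification for a single linear layer y = W x, followed by a simple pruning step. It is an illustration under stated assumptions, not the speaker's implementation: the function names, the row-major layout (one row of W per output unit), and the `min_rate` pruning threshold are all hypothetical choices made for this example.

```python
import numpy as np

def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries of a 1-D gradient, zero the rest.
    Assumes 1 <= k <= grad.size."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse

def sparse_linear_backward(W, x, grad_y, k):
    """Backward pass of y = W @ x using only the top-k entries of grad_y.
    Assumed layout: W has shape (n_out, n_in), so each kept entry of grad_y
    corresponds to one row of W."""
    idx = np.argpartition(np.abs(grad_y), -k)[-k:]
    grad_W = np.zeros_like(W)
    grad_W[idx] = np.outer(grad_y[idx], x)  # only k rows are written
    grad_x = W[idx].T @ grad_y[idx]         # only k rows of W are read
    return grad_W, grad_x, idx

def prune_seldom_updated(W, update_counts, total_steps, min_rate):
    """Drop rows of W whose update frequency is below min_rate.
    min_rate is a hypothetical threshold chosen for illustration."""
    keep = (update_counts / total_steps) >= min_rate
    return W[keep], keep
```

In a training loop, one would accumulate `update_counts[idx] += 1` after each backward pass and call `prune_seldom_updated` periodically. Because only k rows are read and written per pass, the cost of the backward step scales with k rather than with the full output dimension.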
About Dr. Xu Sun
Xu Sun has been an Associate Professor in the Department of Computer Science, Peking University, since 2012. He received his Ph.D. from The University of Tokyo (2010), his M.S. from Peking University (2007), and his B.E. from Huazhong Univ. of Sci. & Tech. (2004). From 2010 to 2012, he worked at The University of Tokyo, Cornell University, and The Hong Kong Polytechnic University as a Research Fellow/Associate. His research focuses on natural language processing and machine learning, especially structured learning for natural language processing and intelligent natural language generation. He has served as Area Chair/Co-Chair of EMNLP 2015, IJCNLP 2017, etc.; as a PC member of ACL, IJCAI, AAAI, COLING, EMNLP, NAACL, etc.; and as a journal reviewer for IEEE TPAMI, Comput. Linguist., TACL, and others. He is an awardee of the Project of Thousand Youth Talents, the Organization Department of the Central Committee of the CPC, China, 2014. He is the recipient of the Qiu Shi Outstanding Young Scholar Award, Qiu Shi Foundation, Hong Kong, China, 2015, and of the Okawa Foundation Research Grant Award, Okawa Foundation, Tokyo, Japan, 2016.