Ultra-High-Performance and Energy-Efficient Deep Learning for Resource-Constrained Devices

Date: 2020/11/09

Speaker: Dr. Ao Ren, Assistant Professor in the Department of Electrical and Computer Engineering at Clemson University, Clemson, SC, USA

Time: 9:00 - 10:00, November 9th, 2020 (Beijing Time)

Location: via Zoom (Meeting ID: 67535776816, Password: 6171)

Abstract

Recent successes of deep learning across many domains rely on vast numbers of weight parameters and the resulting computations, which consume a tremendous amount of memory and energy. As a result, high-quality deep learning models can hardly be deployed on resource-constrained devices such as mobile and IoT devices. In this talk, I will discuss our representative work addressing this problem. First, I will present our ADMM-NN compression framework, which has achieved the highest compression ratios on a diverse set of representative DNN models; for instance, it reduces parameters by 25× on ResNet-50 and by 2,000× on LeNet-5. Then, I will describe our DARB algorithm, which overcomes the limitations of conventional irregular pruning and outperforms prior structured pruning methods by 4×-9× on multiple RNN models. Beyond DNN model compression, we also study emerging computing technologies for next-generation AI systems: I will describe our effort in proposing the first DNN accelerator based on stochastic computing and AQFP superconducting technology, which has achieved the highest energy efficiency among existing DNN accelerators. Finally, I will conclude with my ongoing research and plans toward energy-efficient AI.
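
For background, here is a toy sketch of the general ADMM-based pruning idea behind frameworks such as ADMM-NN. It is an illustrative sketch under stated assumptions, not the ADMM-NN implementation: the "layer" is a least-squares problem, and the names, shapes, and hyperparameters (rho, iters, k) are made up for the example. ADMM alternates a regularized training step, a projection of the weights onto a hard sparsity constraint, and a dual update that tracks the gap between the two.

```python
import numpy as np

def project_topk(w, k):
    """Projection onto the sparsity set: keep the k largest-magnitude
    entries of w and zero out the rest."""
    z = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    z[idx] = w[idx]
    return z

def admm_prune(X, y, k, rho=1.0, iters=50):
    """Toy ADMM pruning of a least-squares 'layer':
    minimize ||Xw - y||^2 subject to w having at most k nonzeros,
    split as w = z with z constrained to the sparse set."""
    n = X.shape[1]
    w = np.linalg.lstsq(X, y, rcond=None)[0]
    z = project_topk(w, k)
    u = np.zeros(n)                        # scaled dual variable
    A = X.T @ X + rho * np.eye(n)
    for _ in range(iters):
        # w-step: training with a quadratic penalty pulling w toward z - u
        w = np.linalg.solve(A, X.T @ y + rho * (z - u))
        # z-step: project w + u onto the top-k sparsity constraint
        z = project_topk(w + u, k)
        # dual update: accumulate the constraint violation w - z
        u += w - z
    return project_topk(w, k)              # hard-prune at the end

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
true_w = np.zeros(50)
true_w[:5] = rng.normal(size=5)
w_pruned = admm_prune(X, X @ true_w, k=5)
print(np.count_nonzero(w_pruned))          # 5: only the kept weights survive
```

DARB prunes in structured blocks rather than element by element; the sketch below shows generic block pruning only (the 4×4 block size and keep ratio are arbitrary assumptions, and DARB's own block-selection policy is not reproduced). Removing whole blocks by norm keeps the surviving sparsity pattern regular, which is what lets structured pruning map efficiently onto hardware where irregular pruning does not.

```python
import numpy as np

def block_prune(W, block=(4, 4), keep_ratio=0.25):
    """Zero out whole blocks of W by block L2 norm, keeping the
    top keep_ratio fraction of blocks. W's dimensions are assumed
    divisible by the block size."""
    rows, cols = W.shape
    br, bc = block
    blocks = W.reshape(rows // br, br, cols // bc, bc)
    norms = np.linalg.norm(blocks, axis=(1, 3))     # one norm per block
    k = max(1, int(keep_ratio * norms.size))
    thresh = np.sort(norms.ravel())[-k]
    mask = (norms >= thresh)[:, None, :, None]      # broadcast over each block
    return (blocks * mask).reshape(rows, cols)

W = np.random.default_rng(1).normal(size=(8, 8))
Wp = block_prune(W)   # 4 blocks total -> 1 kept, weights zeroed in 4x4 tiles
```

Finally, a stochastic-computing accelerator is cheap in hardware because a multiplier collapses to a single logic gate: in the unipolar encoding, a value in [0, 1] becomes the probability that a bit in a random stream is 1, and an AND gate on two independent streams multiplies the encoded values. A minimal simulation of that multiplier (stream length chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(p, length):
    """Encode a value p in [0, 1] as a unipolar stochastic bit-stream:
    each bit is independently 1 with probability p."""
    return rng.random(length) < p

def sc_multiply(a, b, length=4096):
    """One AND gate per bit: P(bit_a AND bit_b) = a * b
    for independent streams."""
    return np.mean(to_stream(a, length) & to_stream(b, length))

print(sc_multiply(0.8, 0.5))   # ~0.4; accuracy improves with stream length
```

Longer streams trade latency for accuracy, which is one reason pairing stochastic computing with an ultra-low-power logic family such as AQFP is attractive for energy-efficient DNN inference.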

Biography

Dr. Ao Ren is currently an Assistant Professor in the Department of Electrical and Computer Engineering at Clemson University, Clemson, SC, USA. He received his Ph.D. from Northeastern University, Boston, MA, USA. His research centers on high-performance and energy-efficient deep learning systems, including DNN model compression algorithms, DNN accelerator architectures, chip design, and emerging computing technologies. His work has appeared in top venues in computer architecture and artificial intelligence, such as ASPLOS, ISCA, AAAI, ISSCC, and ICCAD.