By Leonardo Rezende Juracy, School of Technology, Pontifical Catholic University of Rio Grande do Sul – PUCRS, Brazil, leonardo.juracy@edu.pucrs.br | Rafael Garibotti, School of Technology, Pontifical Catholic University of Rio Grande do Sul – PUCRS, Brazil, rafael.garibotti@pucrs.br | Fernando Gehm Moraes, School of Technology, Pontifical Catholic University of Rio Grande do Sul – PUCRS, Brazil, fernando.moraes@pucrs.br
Over the past decade, machine learning algorithms have proliferated massively, from surveillance applications to self-driving cars. The turning point occurred with the arrival of Convolutional Neural Network (CNN) models and the incredible accuracy brought by Deep Neural Networks (DNNs), at the cost of high computational complexity. In this growing environment, graphics processing units (GPUs) have become the de facto reference platform for the training and inference phases of CNNs and DNNs due to their high processing parallelism and memory bandwidth. However, GPUs are power-hungry architectures. To enable the deployment of CNN and DNN applications on energy-constrained devices (e.g., IoT devices), industry and academic research have moved towards hardware accelerators. Following the evolution of neural networks (from CNNs to DNNs), this survey sheds light on the impact of this architectural shift and discusses hardware accelerator trends in terms of design, exploration, simulation, and frameworks developed in both academia and industry.
The past decade has witnessed the consolidation of Artificial Intelligence technology, thanks to the popularization of Machine Learning (ML) models. The technological boom of ML models started in 2012, when the world was stunned by the record-breaking classification performance achieved by combining an ML model with a high-performance graphics processing unit (GPU). Since then, ML models have received ever-increasing attention, being applied in areas as diverse as computer vision, virtual reality, voice assistants, chatbots, and self-driving vehicles.
The most popular ML models are brain-inspired models such as Neural Networks (NNs), including Convolutional Neural Networks (CNNs) and, more recently, Deep Neural Networks (DNNs). These models loosely resemble the human brain: they process data by mimicking synapses across thousands of interconnected neurons in a network.
In this growing environment, GPUs have become the de facto reference platform for the training and inference phases of CNNs and DNNs, due to their high processing parallelism and memory bandwidth. However, GPUs are power-hungry architectures. To enable the deployment of CNN and DNN applications on energy-constrained devices (e.g., IoT devices), industry and academic research have moved towards hardware accelerators. Following the evolution of neural networks from CNNs to DNNs, this monograph sheds light on the impact of this architectural shift and discusses hardware accelerator trends in terms of design, exploration, simulation, and frameworks developed in both academia and industry.