now publishers - Convolutional Neural Networks Inference Memory Optimization with Receptive Field-Based Input Tiling

APSIPA Transactions on Signal and Information Processing > Vol 12 > Issue 1

Convolutional Neural Networks Inference Memory Optimization with Receptive Field-Based Input Tiling

Weihao Zhuang, Kobe University, Japan, zhuangweihao@stu.kobe-u.ac.jp; Tristan Hascoet, Kobe University, Japan; Xunquan Chen, Kobe University, Japan; Ryoichi Takashima, Kobe University, Japan; Tetsuya Takiguchi, Kobe University, Japan; Yasuo Ariki, Kobe University, Japan

Suggested Citation

Weihao Zhuang, Tristan Hascoet, Xunquan Chen, Ryoichi Takashima, Tetsuya Takiguchi and Yasuo Ariki (2023), "Convolutional Neural Networks Inference Memory Optimization with Receptive Field-Based Input Tiling", APSIPA Transactions on Signal and Information Processing: Vol. 12: No. 1, e3. http://dx.doi.org/10.1561/116.00000015

Publication Date: 18 Jan 2023

Keywords

Convolutional neural network, memory optimization, receptive field

Open Access

This is published under the terms of CC BY-NC.

Downloaded: 1303 times

In this article:

Abstract

Deep learning now plays an indispensable role in many fields, including computer vision, natural language processing, and speech recognition. Convolutional Neural Networks (CNNs) have demonstrated excellent performance in computer vision tasks thanks to their powerful feature-extraction capability. However, because larger models tend to achieve higher accuracy, recent state-of-the-art CNN models consume ever more resources. This paper investigates a conceptual approach to reducing the memory consumption of CNN inference. Our method processes the input image as a sequence of carefully designed tiles within the lower subnetwork of the CNN, so as to minimize peak memory consumption while keeping the end-to-end computation unchanged. This method introduces a trade-off between memory consumption and computation, and is particularly suitable for high-resolution inputs. Our experimental results show that the proposed method reduces MobileNetV2 memory consumption by a factor of up to 5.3. For ResNet50, one of the most commonly used CNN models in computer vision tasks, memory consumption is reduced by a factor of up to 2.3.
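The tiling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy 1D stack of unpadded convolutions, and the helper names (`conv1d_valid`, `input_range`, `run_tiled`) and the layer configuration are hypothetical. For each tile of the final output, the receptive field is traced back to the input slice that tile depends on, and only that slice is processed. The result matches the untiled computation, while the largest intermediate activation shrinks and overlapping input regions are recomputed.

```python
import numpy as np

def conv1d_valid(x, w, stride):
    """1D cross-correlation with no padding."""
    k = len(w)
    n_out = (len(x) - k) // stride + 1
    return np.array([np.dot(x[i * stride:i * stride + k], w) for i in range(n_out)])

def run_stack(x, layers):
    """Run all layers; also report the largest activation any layer produced."""
    peak = 0
    for w, stride in layers:
        x = conv1d_valid(x, w, stride)
        peak = max(peak, len(x))
    return x, peak

def output_length(n, layers):
    """Length of the final output for an input of length n."""
    for w, stride in layers:
        n = (n - len(w)) // stride + 1
    return n

def input_range(out_start, out_end, layers):
    """Trace output range [out_start, out_end) back to the input range it needs."""
    start, end = out_start, out_end
    for w, stride in reversed(layers):
        start = start * stride
        end = (end - 1) * stride + len(w)
    return start, end

def run_tiled(x, layers, tile_out):
    """Compute the output tile by tile, feeding each tile only its receptive field."""
    n_out = output_length(len(x), layers)
    outs, peak = [], 0
    for o in range(0, n_out, tile_out):
        a, b = input_range(o, min(o + tile_out, n_out), layers)
        y, p = run_stack(x[a:b], layers)
        outs.append(y)
        peak = max(peak, p)
    return np.concatenate(outs), peak

# Hypothetical three-layer stack: (kernel weights, stride) per layer.
rng = np.random.default_rng(0)
layers = [(rng.standard_normal(3), 1),
          (rng.standard_normal(3), 2),
          (rng.standard_normal(5), 1)]
x = rng.standard_normal(256)

full, full_peak = run_stack(x, layers)
tiled, tiled_peak = run_tiled(x, layers, tile_out=16)
print(np.allclose(full, tiled), full_peak, tiled_peak)
```

Smaller tiles lower the peak activation size further but increase the overlap between neighbouring receptive fields, so more input positions are processed more than once. This is the memory versus computation trade-off the abstract refers to.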

DOI:10.1561/116.00000015

Introduction
The Proposed Method
Computation vs. Memory Trade-off
Results and Discussion
Conclusion
References
