Optimizing the Performance of Convolutional Neural Networks on Raspberry PI for Real-time Object Detection

Hyun Woo Jung, Hankuk Academy of Foreign Studies, Republic of Korea

Authors

Hyun Woo Jung, Hankuk Academy of Foreign Studies, Republic of Korea

Abstract

Deep learning has facilitated major advancements in various fields, including image detection. This paper is an exploratory study on improving the performance of Convolutional Neural Network (CNN) models in environments with limited computing resources, such as the Raspberry Pi. YOLO (“You Only Look Once”), a pretrained state-of-the-art CNN model for near-real-time object detection in videos, was selected for evaluating strategies to optimize runtime performance. Performance analysis tools provided by the Linux kernel were used to measure CPU time and memory footprint. Our results show that loop parallelization, static compilation of weights, and flattening of convolution layers reduce the total runtime by 85% and the memory footprint by 53% on a Raspberry Pi 3 device. These findings suggest that the methodological improvements proposed in this work can reduce the computational load of running CNN models on devices with limited computing resources.

Keywords

Deep Learning, Convolutional Neural Networks, Raspberry Pi, real-time object detection

CS&IT Conference Proceedings
