Violence Detection Enhancement in Video Sequences Based on Pre-trained Deep Models

Document Type: Original Article

Authors

1 Computer Science and Artificial Intelligence, Computer Science Department, Helwan University

2 Computer Science Department, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo, Egypt

Abstract

Violence detection is one of the challenging applications in the field of human activity recognition (HAR). Improving violence detection through surveillance cameras will help make our societies safer and easier to monitor. In this paper, a two-layer deep model is proposed for classifying video sequences into violent and non-violent actions. The first layer extracts the spatial features of the video frames using the pre-trained DenseNet-121 model. The extracted features are then fed to a long short-term memory (LSTM) network, which captures the temporal features by learning the dependencies between frames, linking all frames of a video into a single action. The proposed model is experimentally evaluated on two datasets. The recognition rate reaches 96% on the open Hockey dataset, outperforming most existing similar models, and 92% on the Real-Life Violence Situations (RLVS) dataset. The implementation of the proposed model is available at: https://github.com/YahiaElkhasahb/Enhancing-Violence-Detection-in-Video-Sequences-Based-on-Deep-Learning-Techniques.
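For readers who want a concrete starting point, the sketch below illustrates in Keras the two-stage architecture described above: a frozen, ImageNet pre-trained DenseNet-121 extracts per-frame spatial features, and an LSTM learns the temporal dependencies between frames before a binary violent/non-violent classification. This is not the authors' released code (see the GitHub link above); the clip length, frame size, LSTM width, and training settings are illustrative assumptions.

```python
# Minimal sketch of a DenseNet-121 + LSTM violence classifier.
# Hyperparameters below are assumptions, not the paper's settings.
import tensorflow as tf

NUM_FRAMES, HEIGHT, WIDTH = 20, 224, 224  # assumed clip length and frame size

# Frozen, ImageNet pre-trained DenseNet-121 as the spatial feature extractor.
backbone = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=(HEIGHT, WIDTH, 3))
backbone.trainable = False

inputs = tf.keras.Input(shape=(NUM_FRAMES, HEIGHT, WIDTH, 3))
# Apply the backbone to every frame independently -> (batch, frames, 1024).
frame_features = tf.keras.layers.TimeDistributed(backbone)(inputs)
# LSTM learns the dependencies between frames, linking them as one action.
temporal = tf.keras.layers.LSTM(128)(frame_features)
# Binary output: violent vs. non-violent.
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(temporal)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```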

Keywords