Scientific Article details

Title Efficiently adapting large pre-trained models for real-time violence recognition in smart city surveillance
ID_Doc 38939
Authors Ren, XH; Fan, WZ; Wang, YH
Year 2024
Published Journal Of Real-Time Image Processing, 21, 4
DOI 10.1007/s11554-024-01486-w
Abstract The concept of smart cities has recently gained prominence, aiming to enhance urban efficiency, safety, and quality of life through advanced technologies. A critical component of this infrastructure is the extensive use of surveillance systems to monitor public spaces and detect violent behavior. As the scale of data and models grows, large-scale pre-trained models demonstrate remarkable capabilities across a wide range of applications. However, adapting these models for violence recognition in surveillance videos poses several challenges, including fine-tuning cost, lack of temporal modeling, and inference overhead. In this paper, we propose an efficient recognition framework that adapts pre-trained models for violent behavior recognition and consists of two paths, a spatial path and a motion path. The framework supports real-time parameter updating and real-time inference, and it is adaptable to various ViT-based pre-trained models. Both paths adopt a parameter-efficient fine-tuning pipeline so that model updating remains real-time. Within the motion path, multiple frames must be processed to capture temporal features, which makes real-time inference challenging. To address this, we compress multiple frames into the size of a single standard image, which keeps inference real-time. Experiments on five datasets demonstrate that our framework achieves state-of-the-art performance, efficiently transferring large pre-trained models to violent behavior recognition.
Author Keywords Smart city; Surveillance video; Real-time violence recognition; Large pre-trained model; Parameter-efficient fine-tuning
Document Type Other
Open Access Open Access
Source Science Citation Index Expanded (SCI-EXPANDED)
EID WOS:001248059400002
WoS Category Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Imaging Science & Photographic Technology
Research Area Computer Science; Engineering; Imaging Science & Photographic Technology