Deep Learning-based Weakly Supervised Video Anomaly Detection Methods for Smart City Applications
Abstract
Video anomaly detection is the spatio-temporal localization of abnormal patterns in video. It is an essential building block of advanced video surveillance-based security systems. Annotating normal and anomalous videos at the frame level is tedious, time-consuming, and error-prone. Hence, weakly supervised video anomaly detection (WSVAD) methods, which train on weakly labeled videos (i.e., videos with video-level annotations), have recently been proposed. This paper investigates and implements eight key state-of-the-art (SOTA) WSVAD methods on two publicly available benchmark video anomaly datasets, UCF-Crime and ShanghaiTech. The implemented methods are Multiple Instance Learning (MIL), Robust Temporal Feature Magnitude (RTFM), Anomaly Regression Net (AR-Net), Bi-directional Encoder Representations from Transformers (BERT), Magnitude-Contrastive Glance-and-Focus Network (MGFN), Temporal Self-Attention (TSA), Weakly Supervised Anomaly Localization (WSAL), Prompt-Enhanced Learning (PEL), and Temporal Context Aggregation (TCA). A comparative analysis of the implemented methods is then carried out to draw insights from the results.