Abstract
With the massive deployment of distributed video surveillance systems, automatic detection of abnormal events in video streams has become an urgent need. An abnormal event can be regarded as a deviation from the regular scene; however, the distribution of normal and abnormal events is severely imbalanced, since abnormal events occur infrequently. To make use of the large number of surveillance videos of regular scenes, we propose a semi-supervised learning scheme that uses only data containing ordinary scenes. The proposed model has a two-stream structure composed of an appearance stream and a motion stream. For each stream, a recurrent variational autoencoder models the probabilistic distribution of the normal data. The appearance and motion features from the two streams provide complementary information for describing this distribution. Comprehensive experiments validate the effectiveness of the proposed scheme on several public benchmark data sets: Avenue, Ped1, Ped2, Subway-entry, and Subway-exit.
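As a rough illustration of how per-frame scores from two streams might be combined, the sketch below computes a reconstruction error per frame for each stream and fuses the two. It is not the paper's method: the abstract does not give the scoring or fusion rule, so the mean squared error, the min-max normalization, the equal weighting, and all function names here are assumptions made for illustration. The reconstructions would come from models trained only on normal data; here they are simply passed in as arrays.

```python
import numpy as np

def frame_errors(frames, reconstructions):
    """Mean squared reconstruction error per frame; inputs shaped (T, H, W)."""
    diff = np.asarray(frames, dtype=float) - np.asarray(reconstructions, dtype=float)
    return (diff ** 2).reshape(diff.shape[0], -1).mean(axis=1)

def min_max(scores):
    """Rescale scores to [0, 1]; a constant sequence maps to all zeros."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span

def fuse_scores(appearance_err, motion_err, weight=0.5):
    """Weighted average of the normalized errors of the two streams.

    `weight` is a hypothetical mixing parameter, not a value from the paper.
    """
    return weight * min_max(appearance_err) + (1.0 - weight) * min_max(motion_err)
```

Frames whose fused score is high are poorly reconstructed by at least one stream and would be candidates for abnormal events. A frame that looks ordinary but moves unusually can still be flagged through the motion term, which is one way to read the abstract's point that the two streams are complementary.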
Original language | English |
---|---|
Article number | 8543857 |
Pages (from-to) | 30-42 |
Number of pages | 13 |
Journal | IEEE Transactions on Cognitive and Developmental Systems |
Volume | 12 |
Issue number | 1 |
Publication status | Published - Mar 2020 |
Keywords
- Abnormal event detection
- convolutional long short-term memory (LSTM)
- reconstruction error probability
- two-stream fusion
- variational autoencoder (VAE)