Video Anomaly Detection Model Trained During Pixel Play '26. The model architecture is a reproduction of the paper Y. Zhao, B. Deng, C. Shen, Y. Liu, H. Lu and X.-S. Hua, “Spatio-Temporal AutoEncoder for Video Anomaly Detection,” in Proceedings of the 25th ACM International Conference on Multimedia (MM ’17), Mountain View, CA, USA, Oct. 2017, pp. 1933–1941, doi: 10.1145/3123266.3123451.
A detailed written report is available: Report.pdf
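For reference, a minimal PyTorch sketch of the dual-branch idea from the paper is shown below: a shared encoder built from 3D convolutions feeds both a reconstruction decoder and a prediction decoder. Layer widths, kernel sizes, and the clip length are illustrative assumptions, not the exact configuration used in the notebook.

```python
# Minimal sketch (assumed layer sizes): shared 3D-conv encoder with two decoder
# branches -- one reconstructs the input clip, one predicts the following frames.
import torch
import torch.nn as nn

class SpatioTemporalAE(nn.Module):
    def __init__(self, in_channels: int = 1):
        super().__init__()
        # Shared spatio-temporal encoder over clips of shape [B, C, T, H, W]
        self.encoder = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=3, stride=(2, 2, 2), padding=1),
            nn.ReLU(inplace=True),
        )
        # Reconstruction branch: mirrors the encoder back to the input clip
        self.reconstruction_decoder = self._make_decoder(in_channels)
        # Prediction branch: same shape, trained to output the next frames
        self.prediction_decoder = self._make_decoder(in_channels)

    @staticmethod
    def _make_decoder(out_channels: int) -> nn.Sequential:
        return nn.Sequential(
            nn.ConvTranspose3d(64, 32, kernel_size=3, stride=(2, 2, 2),
                               padding=1, output_padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(32, out_channels, kernel_size=3, stride=(1, 2, 2),
                               padding=1, output_padding=(0, 1, 1)),
            nn.Sigmoid(),
        )

    def forward(self, clip: torch.Tensor):
        z = self.encoder(clip)
        return self.reconstruction_decoder(z), self.prediction_decoder(z)
```

For a clip such as `torch.rand(1, 1, 8, 128, 128)` (batch, channels, frames, height, width), the sketch returns one reconstruction volume and one prediction volume of the same shape.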
| Man Running (Video 03) | Boy Cavorting (Video 07) | Man Throwing a Bag (Video 05) |
|---|---|---|
| ![]() | ![]() | ![]() |
spatio_temporal_encoder_dual_decoder_model.ipynb contains a sequential implementation of the model. The Jupyter notebook was written in a Google Colab environment and covers:
- Importing dataset from Kaggle.
- Preprocessing training dataset
- Model architecture
- Training loop
- Inference pipeline
- Preprocessing testing dataset (for flipped images)
- Generating submission.csv containing anomaly scores for each video
- Applying a median filter to smooth the anomaly scores (see the sketch after this list)
- Visualizing anomaly score graphs with matplotlib
- Visualizing video reconstruction and prediction capabilities of the model with matplotlib and PIL
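As referenced in the list above, the scoring step can be summarized in a short sketch: per-frame reconstruction error is converted into an anomaly score and smoothed with a median filter. The error metric, normalization, and kernel size used in the notebook may differ; the values below are assumptions.

```python
# Hypothetical sketch of the scoring step: per-frame reconstruction error is
# turned into an anomaly score and smoothed before writing the CSV.
import numpy as np
from scipy.signal import medfilt

def anomaly_scores(frames: np.ndarray, reconstructions: np.ndarray,
                   kernel_size: int = 9) -> np.ndarray:
    """frames, reconstructions: arrays of shape [T, H, W] with values in [0, 1]."""
    # Mean squared error per frame
    errors = ((frames - reconstructions) ** 2).reshape(len(frames), -1).mean(axis=1)
    # Min-max normalize so every video's scores lie in [0, 1]
    scores = (errors - errors.min()) / (errors.max() - errors.min() + 1e-8)
    # Median filtering removes single-frame spikes in the score curve
    return medfilt(scores, kernel_size=kernel_size)
```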
visualize.py writes the anomaly score in the top-left corner of each video frame. It requires the CSV file containing anomaly scores and the test dataset to be readily available.
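A minimal sketch of this kind of overlay with OpenCV is shown below; the font, position, and score format are assumptions rather than the exact choices made in visualize.py.

```python
# Hypothetical sketch of the overlay step: draw the per-frame anomaly score
# in the top-left corner of a BGR frame with OpenCV.
import cv2

def annotate_frame(frame, score: float):
    """frame: BGR image (numpy array); score: anomaly score, e.g. in [0, 1]."""
    cv2.putText(frame, f"score: {score:.3f}", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2, cv2.LINE_AA)
    return frame
```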
model_final_checkpoint.pth contains the model's parameters at epoch 40 of training, saved during a CUDA runtime.
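Because the checkpoint was saved on a GPU runtime, loading it on a CPU-only machine needs map_location. A minimal sketch, assuming the file stores a plain state_dict (adjust if it wraps the weights in a larger checkpoint dictionary); SpatioTemporalAE refers to the architecture sketch above:

```python
import torch

# Map GPU-saved weights onto the CPU before restoring them into the model.
state_dict = torch.load("model_final_checkpoint.pth", map_location=torch.device("cpu"))
model = SpatioTemporalAE()
model.load_state_dict(state_dict)
model.eval()
```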
object_centric_vad.ipynb consists of a sequential implementation of the object-centric autoencoders and binary classifiers. This approach was based on the paper R. T. Ionescu, F. Shahbaz Khan, M.-I. Georgescu, and L. Shao, “Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video,” arXiv, Dec. 11, 2018. [Online]. Available: https://arxiv.org/abs/1812.04960
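The core of that paper's scoring stage, clustering latent codes of normal object crops and training one binary classifier per cluster with the remaining clusters acting as dummy anomalies, can be sketched as below. The object detector and the autoencoders that produce the latent codes are assumed to exist; the cluster count and classifier choice are illustrative, not the exact settings used in the notebook.

```python
# Hypothetical sketch of the "dummy anomaly" classification stage: cluster the
# latent codes of normal object crops, then train one binary classifier per
# cluster, treating the other clusters as dummy anomalies.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def train_dummy_anomaly_classifiers(latents: np.ndarray, n_clusters: int = 10):
    """latents: [N, D] latent codes of normal object crops from the autoencoders."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(latents)
    classifiers = []
    for k in range(n_clusters):
        labels = (kmeans.labels_ == k).astype(int)  # cluster k = normal, rest = dummy anomalies
        classifiers.append(LinearSVC().fit(latents, labels))
    return kmeans, classifiers

def object_score(latent: np.ndarray, classifiers) -> float:
    # Higher = more anomalous: negate the best "normality" margin over all classifiers.
    margins = [clf.decision_function(latent.reshape(1, -1))[0] for clf in classifiers]
    return -max(margins)
```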
All video clips, graphs and images for visualization have been generated through code.
- Anomaly score graphs are available for every testing video in inference_graphs/ (a plotting sketch follows this list).
- Testing videos (mp4) with anomaly scores are available in inference_videos/.
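The graphs referenced above can be reproduced with a few lines of matplotlib; the output file naming below is an assumption.

```python
# Hypothetical plotting sketch for one test video's anomaly score curve.
import matplotlib.pyplot as plt

def plot_scores(scores, video_name: str, out_dir: str = "inference_graphs"):
    plt.figure(figsize=(10, 3))
    plt.plot(scores)
    plt.xlabel("frame index")
    plt.ylabel("anomaly score")
    plt.title(video_name)
    plt.tight_layout()
    plt.savefig(f"{out_dir}/{video_name}.png")
    plt.close()
```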
Testing videos have been generated by running visualize.py to write anomaly scores on each frame. The frames were combined to form an mp4 video with the help of ShutterEncoder using the following settings:
- Function: H.264
- Bitrates adjustment
  - Video bitrate: 21 CQ (Max quality enabled)
- Image sequence
  - 15 fps
- Advanced features
  - Force tune: film
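As an alternative to the ShutterEncoder workflow above, annotated frames can also be stitched into an mp4 directly with OpenCV. This is a substitute sketch, not the process actually used here; the frame file pattern is an assumption, and codec availability depends on the local OpenCV build.

```python
# Alternative sketch: combine annotated frames into an mp4 at 15 fps with OpenCV.
import glob
import cv2

def frames_to_mp4(frame_dir: str, out_path: str, fps: int = 15):
    frame_paths = sorted(glob.glob(f"{frame_dir}/*.png"))
    first = cv2.imread(frame_paths[0])
    height, width = first.shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))
    for path in frame_paths:
        writer.write(cv2.imread(path))
    writer.release()
```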


