Source code for project Panorama, a domain-agnostic video analytics system designed to mitigate the unbounded vocabulary problem. Please check our tech report for more details.
- Python, Keras, and TensorFlow. Only tested with Python 2.7, Keras==2.1.4, and tensorflow==1.4.0 on Ubuntu 16.04.3 LTS. The deployment demos below are also tested with Keras==2.2.4 and tensorflow==1.12 on OS X 10.14.5 (Mojave). `ffmpeg` is also needed for video processing.
- The requirements can be installed by

  ```
  pip install -r requirements.txt
  ```

  OS X users: you need to change `tensorflow-gpu` to `tensorflow` in `requirements.txt`.
- An existing reference model that is capable of object detection and fine-grained classification. Alternatively, a fully annotated (bounding boxes, labels) dataset targeting your application can be used.
- A CUDA-enabled GPU is highly recommended.
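If you are unsure which combination you ended up with, a quick sanity check (the expected version strings are just the tested combinations listed above):

```python
# Print the installed versions to compare against the tested combinations above.
import keras
import tensorflow as tf

print(keras.__version__)  # tested: 2.1.4 (Ubuntu) or 2.2.4 (OS X demos)
print(tf.__version__)     # tested: 1.4.0 (Ubuntu) or 1.12 (OS X demos)
```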
We provide an example of deploying Panorama on face recognition tasks. You can download the pre-trained weights and relevant config files here. The tarball contains three files:
- `faces_config.json`: the Panorama configuration file generated during training.
- `panorama_faces_original_loss_weights.h5`: the PanoramaNet weights.
- `panorama_faces_original_loss_weights.csv`: the model qualification file required to configure PanoramaNet's cascade processing.
Put these files under the folder `.../Panorama-UCSD/trained_models` (create the folder if it does not exist).
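For orientation, the sketch below shows roughly how these three files are consumed; `build_panorama_net` is a hypothetical placeholder for the repo's model-construction code, and `demo.py` wires all of this up for you:

```python
# Rough sketch of how the three files are consumed; demo.py does this for you.
# build_panorama_net is a hypothetical placeholder for the repo's model builder.
import csv
import json

with open("trained_models/faces_config.json") as f:
    config = json.load(f)                    # training-time Panorama configuration

with open("trained_models/panorama_faces_original_loss_weights.csv") as f:
    qualification = list(csv.DictReader(f))  # stats used to configure the cascade

# model = build_panorama_net(config)         # hypothetical builder
# model.load_weights("trained_models/panorama_faces_original_loss_weights.h5")
```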
Run the video demo simply by

```
cd panorama/examples
python demo.py
```
Panorama will then start detecting faces, and you will see the video feed with bounding boxes.
- Now press `s` to enter annotation mode. A frozen frame will pop up.
- Move your mouse to the bounding box that you intend to annotate. The box will change color as you hover over it.
- Click the box and the program will prompt you for a label. Input the label and press enter. Repeat to label other objects in the image window. Once finished, press `c` to exit annotation mode.
- The video will resume playing. Panorama's vocabulary is now enlarged to recognize these people's identities. No CNN retraining happens during this process.
- Press `q` to quit.
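Under the hood, this interaction boils down to an OpenCV event loop along the following lines. This is a simplified sketch rather than the actual `demo.py`; the detection, drawing, and mouse-labeling internals are elided, and the input path is a placeholder:

```python
# Simplified sketch of the demo's interaction loop (not the actual demo.py).
import cv2

cap = cv2.VideoCapture("dataset/faces/video/stream.mp4")  # placeholder input
annotating = False

while True:
    if not annotating:
        ok, frame = cap.read()
        if not ok:
            break
        # ... run PanoramaNet here and cv2.rectangle() each detected box ...
        cv2.imshow("panorama", frame)

    key = cv2.waitKey(1) & 0xFF
    if key == ord("s"):        # freeze the current frame, enter annotation mode
        annotating = True      # (mouse clicks on boxes then prompt for labels)
    elif key == ord("c"):      # exit annotation mode, resume the video
        annotating = False
    elif key == ord("q"):      # quit
        break

cap.release()
cv2.destroyAllWindows()
```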
This is an example of training and deploying Panorama on face recognition tasks, as described in our paper.
- Prepare a long-enough video (~50 hrs) from your video stream. We used CBSN (https://www.cbsnews.com/live/) in our paper for face recognition tests. Create directories for storing the data:

  ```
  $ mkdir -p dataset/faces/raw
  $ mkdir -p dataset/faces/video
  ```

  Then put your video under `dataset/faces/video`.
- Unpack the video into frames and deploy your reference model on the frames to get weakly supervised data. You can use the same reference model we used, which is MTCNN + FaceNet. These steps are described in Section 4.3 of our paper. A sketch of the frame-unpacking step is shown after this list.
  - First create a dir for storing model weights:

    ```
    $ mkdir -p trained_models/align
    ```

  - Then go to link, download the weights for FaceNet, unzip it, and put the extracted folder under `trained_models`.
  - Download the weights for MTCNN: go to link and download all three `*.npy` files to `trained_models/align`.
  - Generate data by running `panorama/data/generate_data.sh`. You may need to modify the paths in this script:

    ```
    $ cd panorama/data
    $ ./generate_data.sh
    ```

    This will take several hours.
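As a rough illustration of the frame-unpacking step above, the sketch below calls `ffmpeg` from Python; the input filename and the sampling rate are placeholders, and the real pipeline (including the reference-model labeling pass) lives in `generate_data.sh` and the scripts it calls:

```python
# Illustrative frame extraction with ffmpeg; generate_data.sh drives the real pipeline.
# The input filename and the sampling rate below are placeholders.
import os
import subprocess

if not os.path.isdir("dataset/faces/raw"):   # works on Python 2.7 as well
    os.makedirs("dataset/faces/raw")

subprocess.check_call([
    "ffmpeg",
    "-i", "dataset/faces/video/stream.mp4",  # hypothetical input video
    "-vf", "fps=1",                          # sample one frame per second
    "dataset/faces/raw/frame_%06d.jpg",      # numbered JPEG frames
])
```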
- Start training by going to `panorama/examples` and executing `run.sh`. The hyperparameters are tuned for this specific example. Depending on your hardware, the training can take up to several days (~2 days on a single GTX 1080Ti). This script will also look at your training data and generate a config file named `faces.json`, which we will need later.
- Once the training is done, we move on to the next step of configuring the cascade, as described in the paper. The script to do this is `panorama/examples/model_qualification.sh`.
- For recognition usage, you need a labeled dataset (you can use the one generated above) to populate an album. Change the paths in `panorama/examples/recognition_ytf.py` and run it. A conceptual sketch of the album lookup follows this list.
- Panorama is now ready for deployment. Please check `examples/examples.ipynb` for, well, examples.
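Conceptually, the album lookup behind recognition is nearest-neighbor matching in embedding space, which is why the vocabulary can grow without retraining. Below is a minimal sketch; the album contents, embedding dimension, and distance threshold are all illustrative assumptions, and `recognition_ytf.py` implements the real version:

```python
# Minimal sketch of album-based recognition: nearest neighbor in embedding space.
import numpy as np

# Album: label -> reference embedding. Entries here are random stand-ins;
# a real album holds embeddings produced by the network for labeled examples.
album = {
    "alice": np.random.rand(128),
    "bob": np.random.rand(128),
}

def recognize(embedding, threshold=0.8):
    """Return the nearest album label, or None if nothing is close enough."""
    best_label, best_dist = None, float("inf")
    for label, ref in album.items():
        dist = np.linalg.norm(embedding - ref)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist < threshold else None

# Enlarging the vocabulary is just adding an entry -- no CNN retraining:
album["carol"] = np.random.rand(128)
```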
