This repository contains the projects for the Computer Vision course in the Electronic Information and Communication Department, Huazhong University of Science and Technology, taught by Prof. Xinggang Wang.
| name | usage |
|---|---|
| Proj1 | perceptron for linear classification |
| Proj2 | one-layer neural network for classification |
| Proj3 | multi-layer neural network for MNIST classification |
| Proj4 | PyTorch version for MNIST or CIFAR-10 classification |
| Proj5 | PyTorch version for single-object localization |
| Proj6 | PyTorch version for semantic segmentation |
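
To illustrate the Proj1 task, the classic perceptron learning rule for linear classification can be sketched in NumPy as below. This is a minimal sketch under assumptions, not the code in this repository: the names `train_perceptron` and `predict`, the label convention {-1, +1}, the learning rate, and the toy data are all hypothetical.

```python
import numpy as np


def train_perceptron(X, y, lr=1.0, max_epochs=100):
    """Classic perceptron rule; labels y are assumed to be in {-1, +1}."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            # A sample is misclassified when its signed margin is not positive.
            if yi * (np.dot(w, xi) + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
                errors += 1
        if errors == 0:  # every sample is on the correct side: stop early
            break
    return w, b


def predict(X, w, b):
    """Return +1 or -1 for each row of X."""
    return np.where(X @ w + b > 0, 1, -1)


# Hypothetical, linearly separable toy data.
X_toy = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y_toy = np.array([1, 1, -1, -1])
w_toy, b_toy = train_perceptron(X_toy, y_toy)
```

The loop only updates the weights on misclassified samples, so it terminates early on linearly separable data; on non-separable data it simply runs for `max_epochs`.
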
| file | usage |
|---|---|
| function.py | defines the activation functions, such as ReLU, softmax, and sigmoid |
| Net.py | defines the network structure and the forward/backward passes |
| mnist.py | loads the images and labels from the .gz files |
| main.py | batch training and figure plotting |
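
Loading MNIST images and labels from `.gz` files, as `mnist.py` is described to do, can be sketched as follows. This is a hedged illustration rather than the actual contents of `mnist.py`: it assumes the files are in the standard gzip-compressed IDX format, and the function names `load_idx_images` and `load_idx_labels` are hypothetical.

```python
import gzip
import struct

import numpy as np


def load_idx_images(path):
    """Read a gzip-compressed IDX image file into an array of shape (n, rows, cols)."""
    with gzip.open(path, "rb") as f:
        # Header: four big-endian uint32 values (magic, count, rows, cols).
        magic, n, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 2051:
            raise ValueError(f"unexpected magic number {magic} for an image file")
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(n, rows, cols)


def load_idx_labels(path):
    """Read a gzip-compressed IDX label file into an array of shape (n,)."""
    with gzip.open(path, "rb") as f:
        # Header: two big-endian uint32 values (magic, count).
        magic, n = struct.unpack(">II", f.read(8))
        if magic != 2049:
            raise ValueError(f"unexpected magic number {magic} for a label file")
        labels = np.frombuffer(f.read(), dtype=np.uint8)
    if labels.size != n:
        raise ValueError("label count does not match the header")
    return labels
```

Pixels come back as `uint8` values in [0, 255]; scaling them to [0, 1] before training is a common choice, though the source does not say whether this project does so.
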
Different activation functions lead to different results.
Different weight initialization schemes lead to different results.
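
The activation functions named above, together with two common initialization schemes, can be sketched in NumPy as follows. This is an illustrative sketch only: the source does not say which initialization schemes were compared, so the Xavier and He variants and the helper name `init_weights` are assumptions.

```python
import numpy as np


def relu(x):
    return np.maximum(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)


def init_weights(fan_in, fan_out, scheme="xavier", rng=None):
    """Hypothetical helper that draws a (fan_in, fan_out) weight matrix."""
    rng = np.random.default_rng(0) if rng is None else rng
    if scheme == "xavier":
        # Xavier/Glorot normal: variance 2 / (fan_in + fan_out).
        std = np.sqrt(2.0 / (fan_in + fan_out))
    elif scheme == "he":
        # He normal: variance 2 / fan_in, often paired with ReLU.
        std = np.sqrt(2.0 / fan_in)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return rng.normal(0.0, std, size=(fan_in, fan_out))
```

Swapping `relu` for `sigmoid` in the hidden layers, or `"xavier"` for `"he"` in `init_weights`, changes the scale of the activations and gradients, which is one reason the training curves can differ between runs.
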
| file | usage |
|---|---|
| MyNet.py | builds a simple CNN model |
| ResNet.py | transfer learning from ResNet-50 |
| VGG.py | transfer learning from VGG16 |
| Plot.py | plots the final results, such as loss and accuracy |
Note: `./pytorch/models` stores only three models, because the ResNet-50 and VGG16 models are too large to store on GitHub.
| file | usage |
|---|---|
| Net.py | pretrained VGG16 network with custom layers |
| VGG_loss_weight.py | main file; uses two loss functions (cross-entropy and Smooth L1) |
| VGGConV_xxx.py | uses convolutional (Conv) layers instead of fully connected (FC) layers |
| visualization.py | plots the final results |
| dataloader.py | loads data from file as ndarray and tensor |
| tiny_vid.tar | dataset sampled from VID |
The following figures show the results for "weight=1:1", "weight=1:1e-1", "weight=1:2e-2", and the "ConV network".
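
A weighted sum of a cross-entropy term and a Smooth L1 term can be sketched in NumPy as follows. This is a hedged sketch, not the code in `VGG_loss_weight.py`: the source does not state which number in a ratio such as 1:1e-1 weights which loss, so the mapping to `w_cls` (classification) and `w_box` (box regression) is an assumption, and all function names are hypothetical. The two terms follow the mean-reduced definitions used by PyTorch's `nn.CrossEntropyLoss` and `nn.SmoothL1Loss` with `beta=1.0`.

```python
import numpy as np


def cross_entropy(logits, targets):
    """Mean cross-entropy from raw logits (N, C) and integer class ids (N,)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()


def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1: quadratic for errors below beta, linear above it."""
    diff = np.abs(pred - target)
    loss = np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return loss.mean()


def localization_loss(logits, labels, boxes_pred, boxes_true,
                      w_cls=1.0, w_box=1.0):
    """Weighted sum of the classification and box-regression terms."""
    cls_term = cross_entropy(logits, labels)
    box_term = smooth_l1(boxes_pred, boxes_true)
    return w_cls * cls_term + w_box * box_term
```

Under the assumed mapping, "weight=1:1e-1" would correspond to `localization_loss(..., w_cls=1.0, w_box=0.1)`. Lowering `w_box` reduces the influence of the regression term on the shared features.
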
| file | usage |
|---|---|
| TinySeg | dataset sampled from VOC 2012 |
| backbone_8stride.py | ResNet backbone structure |
| eval_seg.py | evaluation code |
| train_seg.py | training code |
| test_seg.py | plots the output images |
| pspnet.py | PSP module and network |
| sync_batchnorm | synchronized batch normalization used in DeepLab |
| DeepLab.py | defines the DeepLab v3 network |
| train_seg_deeplab.py | training script for the DeepLab network |
H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. Pyramid Scene Parsing Network. CVPR 2017.
- mIoU is about 0.72, while the state-of-the-art result of PSPNet on VOC 2012 is 0.82
L. Chen, G. Papandreou, F. Schroff, and H. Adam. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv:1706.05587v3.
- mIoU is about 0.74
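
For reference, the mIoU metric quoted above is commonly computed from a confusion matrix, as in the sketch below. This is an assumption-laden illustration and not the code in `eval_seg.py`: the function names are hypothetical, and ignoring label 255 follows the usual VOC convention for void pixels, which this project may or may not use.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count pixels per (true class, predicted class) pair, skipping ignored pixels."""
    valid = target != ignore_index
    idx = num_classes * target[valid].astype(np.int64) + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average the per-class IoU over classes that appear in pred or target."""
    tp = np.diag(conf).astype(np.float64)
    # Union = predicted pixels + true pixels - intersection.
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0
    return (tp[present] / union[present]).mean()
```

In practice the confusion matrices of all validation images are summed first and `mean_iou` is applied once to the total, so that classes are weighted by their pixel counts across the whole set rather than per image.
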