```python
for _ in range(vid_len - 2):
    ret, frame2 = cap.read()
    curr = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    # Resizes directly to a square, without preserving aspect ratio.
    curr = cv2.resize(curr, (_IMAGE_SIZE, _IMAGE_SIZE))
```
In the original author's implementation, the image is first resized preserving aspect ratio (so that the smallest dimension is 256 pixels), then cropped to 224x224.
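For concreteness, a minimal sketch of that resize-then-crop scheme; the helper name and the test-time center crop are my assumptions, not code from this PR:

```python
import cv2

def resize_and_center_crop(frame, short_side=256, crop=224):
    # Resize so the smallest dimension equals `short_side`, preserving aspect ratio.
    h, w = frame.shape[:2]
    scale = short_side / min(h, w)
    frame = cv2.resize(frame, (int(round(w * scale)), int(round(h * scale))),
                       interpolation=cv2.INTER_LINEAR)
    # Center crop to crop x crop (training would use a random 224x224 crop instead).
    h, w = frame.shape[:2]
    top, left = (h - crop) // 2, (w - crop) // 2
    return frame[top:top + crop, left:left + crop]
```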
Are you sure about this? It says in the README of this repository that:
> For RGB, the videos are resized preserving aspect ratio so that the smallest dimension is 256 pixels, with bilinear interpolation. Pixel values are then rescaled between -1 and 1. During training, we randomly select a 224x224 image crop, while during test, we select the center 224x224 image crop from the video. The provided .npy file thus has shape (1, num_frames, 224, 224, 3) for RGB, corresponding to a batch size of 1.
>
> For the Flow stream, after sampling the videos at 25 frames per second, we convert the videos to grayscale. We apply a TV-L1 optical flow algorithm, similar to this code from OpenCV. Pixel values are truncated to the range [-20, 20], then rescaled between -1 and 1. We only use the first two output dimensions, and apply the same cropping as for RGB. The provided .npy file thus has shape (1, num_frames, 224, 224, 2) for Flow, corresponding to a batch size of 1.
I take that to mean that the flow images are not resized at all until they are used as input to the model, which suggests that their resizing happens in-graph.
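If it helps, here is a rough sketch of the flow extraction step that README passage describes. The function name `tvl1_flow` is mine; depending on the OpenCV build, the factory lives at `cv2.optflow.DualTVL1OpticalFlow_create` (opencv-contrib) or `cv2.DualTVL1OpticalFlow_create` (older 3.x releases):

```python
import cv2
import numpy as np

# TV-L1 lives in the optflow contrib module in recent OpenCV builds.
tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()

def tvl1_flow(prev_gray, curr_gray):
    """Dense TV-L1 flow between two grayscale frames, shape (H, W, 2)."""
    flow = tvl1.calc(prev_gray, curr_gray, None)
    # Truncate to [-20, 20] and rescale to [-1, 1], per the README.
    return np.clip(flow, -20, 20) / 20.0
```

Note this produces full-resolution flow; the 256-resize and 224x224 crop would then be applied to the two flow channels the same way as for RGB, wherever that step actually happens (in the data pipeline or in-graph).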
Adding the scripts to download and extract the TV-L1 optical flow from HMDB-51.