Description
Hello,
I am currently trying to implement a program that uses Pixelwise Regression to estimate the pose of at least two hands in a depth video stream (one frame at a time).
I am using the Stereo Labs ZED Mini camera.
Since Pixelwise Regression can only estimate the pose of a single hand per frame, I first use Mediapipe Hands (I know it is overkill; I may change this later) to locate the hands and crop them out of the frame. I then resize each cropped hand to 128x128.
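For context, my crop-and-resize step looks roughly like this. It is a pure-NumPy sketch: the `(x, y)` landmark coordinates are assumed to be the normalized image coordinates that Mediapipe Hands returns, and the padding factor and nearest-neighbour resize are just my own choices:

```python
import numpy as np

def crop_hand(depth, landmarks, pad=0.25, out_size=128):
    """Crop a square hand region from a depth frame and resize it.

    landmarks: list of (x, y) pairs in normalized [0, 1] image
    coordinates (e.g. from Mediapipe Hands).
    """
    h, w = depth.shape
    xs = np.array([p[0] for p in landmarks]) * w
    ys = np.array([p[1] for p in landmarks]) * h
    # Square box centred on the landmarks, enlarged by `pad`
    cx, cy = xs.mean(), ys.mean()
    half = (1 + pad) * max(xs.max() - xs.min(), ys.max() - ys.min()) / 2
    x0, x1 = int(max(cx - half, 0)), int(min(cx + half, w))
    y0, y1 = int(max(cy - half, 0)), int(min(cy + half, h))
    crop = depth[y0:y1, x0:x1]
    # Nearest-neighbour resize to out_size x out_size (no cv2 dependency)
    yi = np.linspace(0, crop.shape[0] - 1, out_size).astype(int)
    xi = np.linspace(0, crop.shape[1] - 1, out_size).astype(int)
    return crop[np.ix_(yi, xi)]
```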
Finally, I run this code:
def estimate(self, img):
    # Downsample the crop to the model's label resolution
    label_img = Resize(size=[self.label_size, self.label_size])(img)
    label_img = tr.reshape(label_img, (1, 1, self.label_size, self.label_size))
    # Foreground mask: non-zero depth pixels
    mask = tr.where(label_img > 0, 1.0, 0.0)
    img = img.to(self.device, non_blocking=True)
    label_img = label_img.to(self.device, non_blocking=True)
    mask = mask.to(self.device, non_blocking=True)
    # Keep only the last stage's outputs; hands_uvd holds the joint UVD coordinates
    self.heatmaps, self.depthmaps, hands_uvd = self.model(img, label_img, mask)[-1]
    hands_uvd = hands_uvd.detach().cpu().numpy()
    self.hands_uvd = hands_uvd
    return hands_uvd
After looking at the values of img, label_img and mask while running test_samples.py, I got the feeling that, unlike mine, those tensors are normalized, which could be the cause of my poor results.
Is my feeling right, and if so, can you explain how I can apply the same preprocessing to my tensors?
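For reference, here is the kind of depth normalization I would guess is needed. This is only a sketch of my assumption: I am guessing the training data was mapped to [-1, 1] around the hand's centre depth inside a fixed-size cube, and the 150 mm half-cube and median-based centring are my own placeholders, not values taken from the repository:

```python
import numpy as np

def normalize_depth_crop(crop, cube_half_mm=150.0):
    """Normalize a cropped depth patch to [-1, 1] around the hand centre.

    Assumes depth is in millimetres and the hand fits in a cube of
    2 * cube_half_mm; both the centring and the cube size are guesses
    at what the training preprocessing might have done.
    """
    valid = crop > 0                        # zeros are missing depth
    if not valid.any():
        return np.zeros_like(crop, dtype=np.float32)
    center = np.median(crop[valid])         # robust estimate of hand depth
    out = (crop.astype(np.float32) - center) / cube_half_mm
    out = np.clip(out, -1.0, 1.0)
    out[~valid] = 1.0                       # push missing pixels to background
    return out
```

Is this roughly the right idea, or does the repository expect a different scheme?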
P.S.: I tested with both the HAND17 and MSRA pretrained models.
Thank you for your work.