This project is part of the 'Free-play Sandbox' experimental framework for Cognitive Human-Robot Interaction research. Check the PInSoRo website for details and documentation.
An rqt GUI to visualise and analyse the 'freeplay sandbox' recordings.
The interface is heavily based on rqt_bag.
To replay a recording:
rosbag play <bag file>
You first need to compile the replay_with_poses utility:
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
Then:
./replay_with_poses --topics camera_purple/rgb/image_raw/compressed camera_yellow/rgb/image_raw/compressed --path $DATASET/<session>/
replay_with_poses has many options (whether to display the raw video stream,
skeletons, gaze estimates, etc.). Check replay_with_poses --help.
Using the --video <output.mkv> option, you can export the resulting video. The
script generate_videos shows how to use this
feature to extract the video streams for the whole dataset.
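For illustration, a batch export along those lines might look like this (a sketch only, assuming one sub-directory per session under $DATASET; refer to the actual generate_videos script for the exact invocation):

for session in "$DATASET"/*/; do
    ./replay_with_poses --topics camera_purple/rgb/image_raw/compressed \
                        --path "$session" \
                        --video "${session}camera_purple.mkv"
done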
Gaze estimation requires the Caffe machine learning framework, which is
packaged in recent versions of Ubuntu: apt install caffe-cuda.
Once Caffe is installed, reconfigure and recompile replay_with_poses;
replay_with_poses --help should then list options to enable gaze estimation.
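Concretely, mirroring the initial build steps (from the same build directory):

cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make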
We rely on CMU OpenPose to extract 2D skeletons and facial features. You must first install OpenPose (which itself requires a working Caffe installation).
Once OpenPose is installed, you can enable the CMake option WITH_OPENPOSE
(cmake -DWITH_OPENPOSE=ON ..) and recompile.
This generates a new tool called extract_poses that takes a bag file as
parameter, extracts all the skeletons, facial features and hands from the video
streams, and stores them as a JSON file. Type extract_poses --help for details.
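For instance, on a session's main recording (assuming extract_poses was built in the build directory):

./extract_poses freeplay.bag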
The extracted pose files (typically, freeplay.poses.json) have the following
structure (see also figures on the OpenPose
website):
{<topic_name>:
    {"frames" : [
        {
            "ts": <timestamp in floating sec>,
            "poses": {
                "1": [ # pose index
                    # x,y in image coordinates (pixels), c is confidence in [0.0,1.0]
                    [x, y, c], # 0- Nose
                    [x, y, c], # 1- Neck
                    [x, y, c], # 2- RShoulder
                    [x, y, c], # 3- RElbow
                    [x, y, c], # 4- RWrist
                    [x, y, c], # 5- LShoulder
                    [x, y, c], # 6- LElbow
                    [x, y, c], # 7- LWrist
                    [x, y, c], # 8- RHip
                    [x, y, c], # 9- RKnee
                    [x, y, c], # 10- RAnkle
                    [x, y, c], # 11- LHip
                    [x, y, c], # 12- LKnee
                    [x, y, c], # 13- LAnkle
                    [x, y, c], # 14- REye
                    [x, y, c], # 15- LEye
                    [x, y, c], # 16- REar
                    [x, y, c]  # 17- LEar
                ],
                "2": [ # if present, second skeleton
                    ...
                ]
            },
            "faces": {
                "1": [ # face index
                    # x,y in image coordinates, c is confidence in [0.0,1.0]
                    [x, y, c],
                    ... # 70 points in total, see the OpenPose website for indices
                ],
                "2": [
                    ...
                ]
            },
            "hands": {
                "1": { # hand index
                    "left": [
                        # x,y in image coordinates, c is confidence in [0.0,1.0]
                        [x, y, c],
                        ... # 20 points in total, see the OpenPose website for indices
                    ],
                    "right": [
                        # x,y in image coordinates, c is confidence in [0.0,1.0]
                        [x, y, c],
                        ... # 20 points in total
                    ]
                },
                "2": {
                    ...
                }
            }
        },
        { # 2nd frame
            "ts": ...,
            "poses":
            ...
        }
    ]
    }
}
Because these JSON files are typically large (>100MB for a 20-25 min session),
we recommend choosing your JSON library carefully, both for parsing speed and
memory footprint (hint: libjsoncpp does not fare well). In our experience, the
nlohmann JSON library does a decent job; simdjson is probably the fastest option today.
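For illustration, a minimal sketch of reading such a pose file with the nlohmann library (C++17; the topic name and file name are placeholders, and error handling is omitted):

#include <fstream>
#include <iostream>
#include <nlohmann/json.hpp>

int main() {
    // The whole file is loaded in memory: expect substantial RAM usage
    std::ifstream in("freeplay.poses.json");
    nlohmann::json root;
    in >> root;

    // Topic names are the top-level keys (placeholder here)
    const auto& frames = root["camera_purple/rgb/image_raw/compressed"]["frames"];

    for (const auto& frame : frames) {
        if (!frame.contains("poses")) continue; // skip frames without detections
        double ts = frame["ts"];
        // each entry of "poses" is one detected skeleton, keyed "1", "2", ...
        for (const auto& [idx, skeleton] : frame["poses"].items()) {
            // keypoint 0 is the nose: [x, y, confidence]
            std::cout << ts << " skeleton " << idx
                      << " nose at (" << skeleton[0][0] << ", " << skeleton[0][1]
                      << "), confidence " << skeleton[0][2] << "\n";
        }
    }
}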
You can extract a video stream from one of the bag files and save it as a video file.
First, install bag_tools:
sudo apt install ros-kinetic-bag-tools
Then:
rosrun bag_tools make_video.py <topic> <bag> --output <output.mp4> --fps <fps>
For instance:
rosrun bag_tools make_video.py env_camera/qhd/image_color/compressed freeplay.bag --output freeplay_env.mp4 --fps 28.0
(note that, due to an upstream bug, you first need to change the type of the fps
parameter from int to float in make_video.py for non-integer FPS values to work)
Alternatively, you can use replay_with_poses to replay the videos (including
overlaying skeleton/face features if desired) and save them as video files. See
above for details.
To extract audio and save it as an audio file:
rosrun audio_play audio_play audio:=<topic> _dst:=<file.ogg>
For instance:
rosrun audio_play audio_play audio:=camera_purple/audio _dst:=freeplay_purple.ogg &
rosbag play freeplay.bag
Once the video streams have been extracted from the bag files (cf. above), you
can use the script create_video_mosaic to generate a mosaic with 49 streams
tiled together. This requires ffmpeg.
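For reference, such a mosaic boils down to ffmpeg's xstack filter; a minimal sketch tiling four streams into a 2x2 grid (file names are placeholders and all streams are assumed to share the same resolution; the full 49-stream layout is handled by the create_video_mosaic script itself):

ffmpeg -i s1.mp4 -i s2.mp4 -i s3.mp4 -i s4.mp4 \
       -filter_complex "[0:v][1:v][2:v][3:v]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]" \
       -map "[v]" mosaic.mp4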
