Technical Summary

Synopsis of the HumanPose System

The code in this repository relies on the Intel® Distribution of OpenVINO™ toolkit. OpenVINO (Open Visual Inference and Neural network Optimization) is a free toolkit that facilitates optimizing deep learning models and deploying them onto Intel hardware. Note that the current release of this repository is intended to run on an Intel® CPU (deployment on an iGPU or an Intel® Movidius™ Vision Processing Unit (VPU) could be considered in future releases). I have made another release of this project that relies on the original release of OpenPose, but it needs a powerful GPU to run fast enough. Thanks to OpenVINO, the Tello Selfie Assistant can work on a much broader variety of hardware configurations.

HumanPose Content

1. tello_selfie_assistant.py

usage: tello_selfie_assistant.py [-h] [-l LOG_LEVEL] [-1] [--no_sound]
                                 [-k KEYBOARD_LAYOUT] [-s HEIGHT_SIZE]

optional arguments:
  -h, --help            show this help message and exit
  -l LOG_LEVEL, --log_level LOG_LEVEL
                        select a log level (info, debug,...)
  -1, --monoprocess     use 1 process (instead of 2)
  --no_sound            Deactivate sound
  -k KEYBOARD_LAYOUT, --keyboard_layout KEYBOARD_LAYOUT
                        Your keyboard layout (QWERTY or AZERTY)
                        (default=QWERTY)
  -s HEIGHT_SIZE, --height_size HEIGHT_SIZE
                        Network input layer height size. The smaller the
                        faster (default=256)

--height_size HEIGHT_SIZE

Before a video frame is processed by the human pose model, it is first resized to the model's input layer size (input_height, input_width). By default, input_height = 256 and input_width = input_height * frame_width / frame_height. The model can be reshaped by giving a new value to input_height. A smaller input_height makes inference faster, but the pose estimation may be less accurate. My advice: start with the default value (256) and, if the FPS is too slow, decrease it by 10%, and so on until you find the best compromise.
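The resize logic above can be sketched in a few lines. This is an illustrative sketch, not the repository's exact code: the function name, and the assumption that the width is rounded to a multiple of the network stride (8 here), are mine.

```python
# Illustrative sketch of the resize logic described above (names and the
# stride rounding are assumptions, not the repository's exact identifiers).

def input_shape(frame_width, frame_height, input_height=256, multiple=8):
    """Return (input_height, input_width) for the pose network.

    The width is derived from the frame's aspect ratio; many pose models
    expect dimensions that are a multiple of the network stride, so the
    width is rounded to the nearest such multiple (assumed to be 8 here).
    """
    input_width = round(input_height * frame_width / frame_height / multiple) * multiple
    return input_height, input_width

# Example: a 960x720 frame with the default input_height
print(input_shape(960, 720))        # (256, 344)
# A ~10% smaller input_height trades accuracy for speed
print(input_shape(960, 720, 230))   # (230, 304)
```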

--monoprocess

You shouldn't need to use this option. By default (without it), the application uses two processes: one receives the video frames sent by the drone, while the other runs the pose estimation, recognizes poses, and sends flight commands back to the drone. With the option '--monoprocess', a single process manages communication in both directions, and the FPS is a bit lower.
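The two-process design described above can be sketched with the standard library. This is a minimal, hypothetical illustration of the hand-off between the frame-receiving process and the pose-estimation process; the function names and the queue-based exchange are assumptions, not the repository's actual implementation.

```python
# Minimal sketch of a two-process frame pipeline (assumed design, not the
# repository's actual code): one process receives frames, the other
# "processes" them, with a queue in between.
import multiprocessing as mp

def frame_receiver(frame_queue, n_frames=5):
    """Stand-in for the process receiving video frames from the drone."""
    for i in range(n_frames):
        frame_queue.put(f"frame-{i}")   # a real frame would be an image array
    frame_queue.put(None)               # sentinel: no more frames

def pose_worker(frame_queue, result_queue):
    """Stand-in for the process running pose estimation on each frame."""
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        result_queue.put(f"processed {frame}")

def run_pipeline(n_frames=5):
    frame_queue, result_queue = mp.Queue(), mp.Queue()
    rx = mp.Process(target=frame_receiver, args=(frame_queue, n_frames))
    worker = mp.Process(target=pose_worker, args=(frame_queue, result_queue))
    rx.start(); worker.start()
    rx.join(); worker.join()
    return [result_queue.get() for _ in range(n_frames)]

if __name__ == "__main__":
    print(run_pipeline(3))
```

Running inference in a separate process keeps the frame receiver responsive even when a single inference takes longer than the frame interval.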

--no_sound

You shouldn't need to use this option: it is convenient to have audio feedback when some actions are triggered. Use '--no_sound' only if you want to disable it.

--keyboard_layout KEYBOARD_LAYOUT

In case you want to use your keyboard to pilot the drone, use this argument to select your keyboard layout. Here is the key/action mapping:
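One way a layout option like this can be handled is to translate a pressed character back to its QWERTY key position before looking up the action, since on AZERTY keyboards A/Q and Z/W swap places. The sketch below is purely illustrative: the action bindings and helper names are hypothetical examples, not the application's actual mapping.

```python
# Purely illustrative sketch of QWERTY/AZERTY handling: translate the
# pressed character back to its QWERTY position before looking up the
# action. Bindings below are hypothetical, not the application's mapping.

# Characters that sit on different physical keys on AZERTY vs. QWERTY
AZERTY_TO_QWERTY = {"a": "q", "q": "a", "z": "w", "w": "z"}

# Actions bound to QWERTY key positions (hypothetical bindings)
ACTIONS = {"w": "forward", "s": "backward", "a": "left", "d": "right"}

def key_to_action(char, layout="QWERTY"):
    """Map a pressed character to an action, honoring the keyboard layout."""
    if layout.upper() == "AZERTY":
        char = AZERTY_TO_QWERTY.get(char, char)
    return ACTIONS.get(char)

# On AZERTY, the key in QWERTY's 'w' position produces 'z'
print(key_to_action("z", layout="AZERTY"))  # forward
```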

2. human_pose.py

This script is the module that performs the pose estimation. If you have a webcam and have run the container with --device=/dev/video0:/dev/video0 as explained above, you can apply the human pose estimation directly to your webcam stream.

python human_pose.py -i /dev/video0

3. test_gestures.py

As with human_pose.py, you can use your webcam video stream to test and practice the different poses predefined in tello_selfie_assistant.py. Simply run:

python test_gestures.py -i /dev/video0

4. humanpose_driver.py

This script provides the ROS Melodic integration, allowing the HumanPose system to interface with other drone types. It is used to launch a ROS node:

roscore
roslaunch humanpose_driver

Future Plans:

Currently, HumanPose development is moving toward a more hardware-independent platform that allows use on any aerial vehicle over ROS. This would let the HumanPose system integrate seamlessly with the GAAS and Redtail projects into a larger project encompassing all of the work done so far.
