Software Setup

Synopsis of the HumanPose System

The code in this repository relies on the Intel® Distribution of OpenVINO™ toolkit. OpenVINO (Open Visual Inference and Neural Network Optimization) is a free toolkit that facilitates the optimization of deep learning models and their deployment onto Intel hardware. Note that the current release of this repository is intended to run on an Intel® CPU (deployment on an iGPU or an Intel® Movidius™ Vision Processing Unit (VPU) could be considered in future releases). An earlier release of this project relied on the original release of OpenPose and needed a powerful GPU to run fast enough. Thanks to OpenVINO, the Tello Selfie Assistant can work on a much broader variety of hardware configurations.
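
For reference, inference with the OpenVINO Inference Engine Python API on a CPU boils down to a few calls. The sketch below is only an illustration: the model file, input image, and blob handling are placeholders, not the code actually used by this repository.

# Minimal sketch of CPU inference with the OpenVINO Inference Engine Python API.
# The model path, image and pre/post-processing are placeholders, not this repo's code.
import cv2
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="human-pose-estimation.xml", weights="human-pose-estimation.bin")
input_name = next(iter(net.input_info))
exec_net = ie.load_network(network=net, device_name="CPU")

frame = cv2.imread("person.jpg")                       # placeholder image
n, c, h, w = net.input_info[input_name].input_data.shape
blob = cv2.resize(frame, (w, h)).transpose(2, 0, 1)    # HWC -> CHW
blob = blob[np.newaxis, ...].astype(np.float32)        # add batch dimension -> NCHW
result = exec_net.infer({input_name: blob})            # dict of output blobs (heatmaps, PAFs, ...)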

Docker

How to Install Docker

The HumanPose system is shipped inside a Docker container. To get the full capability of the HumanPose software, it is highly recommended to run the program through Docker on Ubuntu 18.04 64-bit.

Follow the step-by-step instructions from Docker to install it.

Docker Contents

Content of the Docker image

  1. Installation of Ubuntu 18.04

  2. Installation of a few packages (compiler, cmake, interface to sound drivers, git,...)

  3. Clone of this Github repository

  4. Installation of Python via Miniconda. I chose Miniconda because installing pyav is much simpler with conda than with pip

  5. Download and installation of the following packages of OpenVINO:

    • Inference Engine Development Kit

    • Inference Engine Runtime for Intel CPU

    • OpenCV (optimized for Intel processors)

  6. Installation of the following Python packages:

    • TelloPy : Tello drone controller from https://github.com/hanyazou/TelloPy. I have slightly modified this version so that it can be run in 2 processes. The modified version is in the TelloPy directory of this repository;

    • simple-pid : a simple and easy-to-use PID controller (https://github.com/m-lundberg/simple-pid); see the usage sketch after this list;

    • pynput : to read keyboard events (the keyboard can also be used to pilot the drone);

    • pygame : to play sounds (audio feedback when poses are recognized).
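
As an illustration of where simple-pid fits in, the sketch below keeps a tracked person horizontally centered in the frame. The gains, setpoint, and helper function are made up for the example and are not taken from this repository.

# Sketch of using simple-pid to keep the tracked person centered in the frame.
# Gains, setpoint and the yaw_command() helper are illustrative only.
from simple_pid import PID

# Setpoint 0 means "keep the person's horizontal offset from the frame center at zero".
yaw_pid = PID(Kp=0.25, Ki=0.0, Kd=0.05, setpoint=0, output_limits=(-100, 100))

def yaw_command(person_x, frame_width):
    # Offset in [-1, 1]: negative when the person is left of center.
    offset = (person_x - frame_width / 2) / (frame_width / 2)
    return yaw_pid(offset)  # value to send as the drone's rotation speed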

Docker Commands

On your Docker host, you first need to authorize access to the X server:

xhost +local:

Then run the Docker image:

docker run -it --rm -e DISPLAY=${DISPLAY} -v /tmp/.X11-unix:/tmp/.X11-unix --network host --device /dev/snd  -e PULSE_SERVER=unix:${XDG_RUNTIME_DIR}/pulse/native -v ${XDG_RUNTIME_DIR}/pulse/native:${XDG_RUNTIME_DIR}/pulse/native -v ~/.config/pulse/cookie:/root/.config/pulse/cookie -v $PICTURE_DIR:/work/pictures --name tello geaxgx/tello_humanpose_openvino:latest

This command starts the Docker container with access to the display, the sound server, the host network, and the pictures directory.

Container Contents

1. tello_selfie_assistant.py

usage: tello_selfie_assistant.py [-h] [-l LOG_LEVEL] [-1] [--no_sound]
                                 [-k KEYBOARD_LAYOUT] [-s HEIGHT_SIZE]

optional arguments:
  -h, --help            show this help message and exit
  -l LOG_LEVEL, --log_level LOG_LEVEL
                        select a log level (info, debug,...)
  -1, --monoprocess     use 1 process (instead of 2)
  --no_sound            Desactivate sound
  -k KEYBOARD_LAYOUT, --keyboard_layout KEYBOARD_LAYOUT
                        Your keyboard layout (QWERTY or AZERTY)
                        (default=QWERTY)
  -s HEIGHT_SIZE, --height_size HEIGHT_SIZE
                        Network input layer height size. The smaller the
                        faster (default=256)

--height_size HEIGHT_SIZE

Before a video frame is processed by the human pose model, the frame is first resized to the input layer size (input_height, input_width). By default, input_height = 256 and input_width = input_height * frame_width / frame_height. The model can be reshaped by giving a new value to input_height. If you give a smaller value to input_height, inference will be faster, but the pose estimation may be less accurate. My advice: start with the default value (256) and, if the FPS is too slow, decrease it by 10%, and so on, until you find the best compromise.
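
As a sketch of the resizing described above (not the repository's exact code), the input width is derived from the chosen input height and the frame's aspect ratio:

# Sketch of the resize described above: the frame is scaled so that its height
# matches the network input height, preserving the aspect ratio.
import cv2

def resize_for_network(frame, input_height=256):
    frame_height, frame_width = frame.shape[:2]
    input_width = int(round(input_height * frame_width / frame_height))
    return cv2.resize(frame, (input_width, input_height))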

--monoprocess

You shouldn't need to use this option. By default (without this option), the application uses 2 processes: one process receives the video frames sent by the drone, while the other runs the pose estimation, recognizes poses, and sends flight commands back to the drone. With the option '--monoprocess', a single process manages communications in both directions, and the FPS is a bit lower.
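
The two-process split can be pictured with the simplified sketch below; the producer and consumer bodies are placeholders, not the actual implementation.

# Simplified sketch of the 2-process pattern: one process receives video frames,
# the other consumes them for pose estimation and flight commands.
# The worker bodies are placeholders, not the repository's actual code.
from multiprocessing import Process, Queue

def receive_frames(frame_queue):
    """Producer: would read frames from the Tello video stream."""
    for i in range(100):                 # placeholder loop instead of the real video stream
        frame_queue.put(i)
    frame_queue.put(None)                # sentinel: no more frames

def process_frames(frame_queue):
    """Consumer: would run pose estimation and send flight commands back to the drone."""
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        # placeholder for: pose = estimate_pose(frame); send_flight_command(pose)

if __name__ == "__main__":
    q = Queue(maxsize=2)                 # bounded queue to limit the backlog of frames
    producer = Process(target=receive_frames, args=(q,))
    producer.start()
    process_frames(q)
    producer.join()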

--no_sound

You shouldn't need to use this option: the audio feedback is convenient when actions are triggered. Use it only if you want to disable sound.
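
For reference, the pygame side of this audio feedback amounts to something like the following sketch (the sound file path is a placeholder):

# Minimal sketch of playing a short sound with pygame when a pose is recognized.
# The file path is a placeholder.
import pygame

pygame.mixer.init()
takeoff_sound = pygame.mixer.Sound("sounds/takeoff.wav")  # hypothetical path
takeoff_sound.play()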

--keyboard_layout KEYBOARD_LAYOUT

If you want to use your keyboard to pilot the drone, use this argument to select your keyboard layout. Here is the key/action mapping (a small pynput listener sketch follows the table):

QWERTY        AZERTY        Action
w             z             Go forward
s             s             Go backward
a             q             Go left
d             d             Go right
q             a             Rotate left
e             e             Rotate right
Left arrow    Left arrow    Rotate left
Right arrow   Right arrow   Rotate right
Up arrow      Up arrow      Go up
Down arrow    Down arrow    Go down
Tab           Tab           Takeoff
Backspace     Backspace     Landing
p             p             Palm landing
t             t             Toggle tracking
h             h             Toggle HumanPose estimation
Enter         Enter         Take a picture
0,1,2,3,4,5   0,1,2,3,4,5   Set video encoder rate
7             7             Decrease exposure
8             8             Auto exposure
9             9             Increase exposure
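
For reference, keyboard events are read with pynput; the following is only a minimal sketch of such a listener, with made-up bindings and a placeholder send_command() instead of the repository's actual code:

# Minimal sketch of reading keyboard events with pynput and mapping them to drone
# commands. The bindings and send_command() are illustrative placeholders.
from pynput import keyboard

def send_command(command):
    print("would send:", command)        # placeholder for the TelloPy call

def on_press(key):
    try:
        mapping = {"w": "forward", "s": "backward", "a": "left", "d": "right"}
        if key.char in mapping:
            send_command(mapping[key.char])
    except AttributeError:
        # Special keys (Tab, Backspace, arrows, ...) have no .char attribute
        if key == keyboard.Key.tab:
            send_command("takeoff")
        elif key == keyboard.Key.backspace:
            send_command("land")

with keyboard.Listener(on_press=on_press) as listener:
    listener.join()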

2. human_pose.py

This script is the module that performs the pose estimation. If you have a webcam and have run the container with --device=/dev/video0:/dev/video0 added to the docker run command above, you can apply the human pose estimation directly to your webcam stream.

python human_pose.py -i /dev/video0

3. test_gestures.py

As with human_pose.py, you can use your webcam video stream to test and practice the different poses predefined in tello_selfie_assistant.py. Simply run:

python test_gestures.py -i /dev/video0

4. humanpose_driver.py

This provides the ROS Melodic conversion that lets the HumanPose system interface with other drone types. It is used to launch a ROS node:

roscore
roslaunch humanpose_driver
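
As an illustration only, a minimal ROS Melodic node could look like the sketch below; the node name, topic, and message type are assumptions, not necessarily what humanpose_driver.py uses.

#!/usr/bin/env python
# Minimal sketch of a ROS Melodic node; the node name, topic and message type
# are assumptions, not necessarily those used by humanpose_driver.py.
import rospy
from geometry_msgs.msg import Twist

def main():
    rospy.init_node("humanpose_driver")
    cmd_pub = rospy.Publisher("cmd_vel", Twist, queue_size=1)
    rate = rospy.Rate(10)                 # 10 Hz control loop
    while not rospy.is_shutdown():
        cmd = Twist()                     # placeholder: fill from pose-derived commands
        cmd_pub.publish(cmd)
        rate.sleep()

if __name__ == "__main__":
    main()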
