Software Setup
Synopsis of the HumanPose System
The code in this repository relies on the Intel® Distribution of OpenVINO™ toolkit. OpenVINO (Open Visual Inference and Neural network Optimization) is a free toolkit that facilitates the optimization of deep learning models and their deployment onto Intel hardware. Note that the current release of this repository is intended to run on an Intel® CPU (deployment on an iGPU or an Intel® Movidius™ Vision Processing Unit (VPU) could be considered in future releases). I have made another release of this project that relies on the original release of OpenPose and needs a powerful GPU to run fast enough. Thanks to OpenVINO, the Tello Selfie Assistant can work on a much broader variety of hardware configurations.
Docker
How to Install Docker
The HumanPose system runs inside a Docker container. To make full use of the HumanPose software, it is highly recommended that you run the program through Docker on Ubuntu 18.04 64-bit.
Follow Docker's step-by-step instructions to install it.
Docker Contents
Content of the docker image
Installation of Ubuntu 18.04
Installation of a few packages (compiler, cmake, interface to sound drivers, git,...)
Clone of this Github repository
Installation of Python via Miniconda. I have chosen Miniconda because installing pyav is much simpler with conda than with pip
Download and installation of the following packages of OpenVINO:
Inference Engine Development Kit
Inference Engine Runtime for Intel CPU
OpenCV (optimized for Intel processors)
Installation of the following Python packages:
TelloPy: Tello drone controller from https://github.com/hanyazou/TelloPy. I have slightly modified this version so that it can run in 2 processes. The modified version is in the TelloPy directory of this repository;
simple-pid: a simple and easy-to-use PID controller (https://github.com/m-lundberg/simple-pid);
pynput: to read keyboard events (the keyboard can also be used to pilot the drone);
pygame: to play sounds (gives audio feedback when poses are recognized).
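The tracking behaviour described later relies on PID control. The sketch below is my own illustrative implementation of the control loop that a package like simple-pid provides (it is not the library's code; class and attribute names are assumptions):

```python
class SimplePID:
    """Minimal PID controller sketch, illustrating the control loop
    a library like simple-pid provides (illustrative, not its code)."""

    def __init__(self, kp, ki, kd, setpoint=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self._integral = 0.0
        self._prev_error = 0.0

    def __call__(self, measurement, dt=1.0):
        # Classic PID: output = Kp*e + Ki*integral(e) + Kd*de/dt
        error = self.setpoint - measurement
        self._integral += error * dt
        derivative = (error - self._prev_error) / dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


# A purely proportional controller: the output is Kp times the error.
pid = SimplePID(kp=1.0, ki=0.0, kd=0.0, setpoint=10.0)
print(pid(7.0))  # error is 3.0, so the output is 3.0
```

In the drone's case, the measurement would be something like the horizontal offset of the tracked person in the frame, and the output a yaw command that re-centers them.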
Docker Commands
On your Docker host, you first need to authorize access to the X server:
xhost +local:
Then run the docker image:
docker run -it --rm -e DISPLAY=${DISPLAY} -v /tmp/.X11-unix:/tmp/.X11-unix --network host --device /dev/snd -e PULSE_SERVER=unix:${XDG_RUNTIME_DIR}/pulse/native -v ${XDG_RUNTIME_DIR}/pulse/native:${XDG_RUNTIME_DIR}/pulse/native -v ~/.config/pulse/cookie:/root/.config/pulse/cookie -v $PICTURE_DIR:/work/pictures --name tello geaxgx/tello_humanpose_openvino:latest
This starts the Docker container with its basic configuration.
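Note that the docker run command above references $PICTURE_DIR, the host directory that is mounted as /work/pictures inside the container (where pictures taken by the drone are saved). It must be set before running the command; the path below is only an example:

```shell
# PICTURE_DIR is mounted as /work/pictures in the container;
# the path chosen here is just an illustration.
export PICTURE_DIR="$HOME/tello_pictures"
mkdir -p "$PICTURE_DIR"
```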
Container Contents
1. tello_selfie_assistant.py
usage: tello_selfie_assistant.py [-h] [-l LOG_LEVEL] [-1] [--no_sound]
[-k KEYBOARD_LAYOUT] [-s HEIGHT_SIZE]
optional arguments:
-h, --help show this help message and exit
-l LOG_LEVEL, --log_level LOG_LEVEL
select a log level (info, debug,...)
-1, --monoprocess use 1 process (instead of 2)
--no_sound Deactivate sound
-k KEYBOARD_LAYOUT, --keyboard_layout KEYBOARD_LAYOUT
Your keyboard layout (QWERTY or AZERTY)
(default=QWERTY)
-s HEIGHT_SIZE, --height_size HEIGHT_SIZE
Network input layer height size. The smaller the
faster (default=256)
--height_size HEIGHT_SIZE
Before a video frame is processed by the human pose model, the frame is first resized to the input layer size (input_height, input_width). By default, input_height = 256 and input_width = input_height * frame_width / frame_height. The model can be reshaped by giving a new value to input_height. A smaller input_height makes inference faster, but pose estimation may be less accurate. My advice: start with the default value (256) and, if the FPS is too slow, decrease it by 10%, and so on until you find the best compromise.
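The resizing rule above can be written as a short helper (the function name is hypothetical, and rounding to the nearest integer is an assumption; the actual code may round differently):

```python
def network_input_size(frame_width, frame_height, input_height=256):
    """Compute the (height, width) a frame is resized to before
    inference: the width is scaled to keep the frame's aspect ratio."""
    input_width = round(input_height * frame_width / frame_height)
    return input_height, input_width


# The Tello streams 960x720 video, so the default network input is:
print(network_input_size(960, 720))  # (256, 341)
```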
--monoprocess
You shouldn't need to use this option. By default (without this option), the application uses 2 processes: one process receives the video frames sent by the drone, while the other runs the pose estimation, recognizes poses, and sends flight commands back to the drone. With the '--monoprocess' option, a single process manages communications in both directions, and the FPS is slightly lower.
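The two-process pipeline can be sketched as a producer/consumer pair connected by a queue (shown here with threads for brevity; the real application uses two OS processes, and every name below is illustrative):

```python
import queue
import threading

frames = queue.Queue()
results = []


def receive_frames():
    # Role of process 1 in the application: receive video frames
    # from the drone and hand them to the other worker.
    for i in range(3):
        frames.put(f"frame-{i}")
    frames.put(None)  # sentinel: no more frames


def process_frames():
    # Role of process 2: run pose estimation on each frame and
    # send flight commands back to the drone.
    while True:
        frame = frames.get()
        if frame is None:
            break
        results.append(f"pose({frame})")


t1 = threading.Thread(target=receive_frames)
t2 = threading.Thread(target=process_frames)
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # ['pose(frame-0)', 'pose(frame-1)', 'pose(frame-2)']
```

Decoupling reception from processing keeps the video socket drained even when inference is slow, which is why the two-process default gives a slightly higher FPS.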
--no_sound
You shouldn't need to use this option. It is convenient to have audio feedback when some actions are triggered.
--keyboard_layout KEYBOARD_LAYOUT
In case you want to use your keyboard to pilot the drone, use this argument to select your keyboard layout. Here is the mapping key/action :
| QWERTY | AZERTY | Action |
| --- | --- | --- |
| w | z | Go forward |
| s | s | Go backward |
| a | q | Go left |
| d | d | Go right |
| q | a | Rotate left |
| e | e | Rotate right |
| ← | ← | Rotate left |
| → | → | Rotate right |
| ↑ | ↑ | Go up |
| ↓ | ↓ | Go down |
| Tab | Tab | Takeoff |
| Backspace | Backspace | Landing |
| p | p | Palm landing |
| t | t | Toggle tracking |
| h | h | Toggle HumanPose estimation |
| Enter | Enter | Take a picture |
| 0,1,2,3,4,5 | 0,1,2,3,4,5 | Set video encoder rate |
| 7 | 7 | Decrease exposure |
| 8 | 8 | Auto exposure |
| 9 | 9 | Increase exposure |
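One way a layout option like this can be handled is with a translation table from AZERTY key names to their QWERTY equivalents, so the rest of the code only deals with one layout. A hypothetical sketch (not the application's actual code), covering the keys that differ between the two layouts in the mapping above:

```python
# Keys whose position differs between AZERTY and QWERTY in the
# mapping above (hypothetical helper, not the application's code).
AZERTY_TO_QWERTY = {"z": "w", "q": "a", "a": "q"}


def normalize_key(key, layout="QWERTY"):
    """Translate a pressed key to its QWERTY equivalent; keys that
    are identical in both layouts pass through unchanged."""
    if layout.upper() == "AZERTY":
        return AZERTY_TO_QWERTY.get(key, key)
    return key


print(normalize_key("z", "AZERTY"))  # 'w' -> Go forward
```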
2. human_pose.py
This script is the module that performs the pose estimation. If you have a webcam and have run the container with --device=/dev/video0:/dev/video0 added to the docker run command, you can apply human pose estimation directly to your webcam stream:
python human_pose.py -i /dev/video0
3. test_gestures.py
As with human_pose.py, you can use your webcam video stream to test and practice the different poses predefined in tello_selfie_assistant.py. Simply run:
python test_gestures.py -i /dev/video0
4. humanpose_driver.py
This provides the ROS Melodic conversion that lets the HumanPose system interface with other drone types. It is used to launch a ROS node:
roscore
roslaunch humanpose_driver
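roslaunch resolves a package name and a launch file. A hypothetical minimal launch file for such a node might look like the fragment below (the package, script, and node names are assumptions inferred from the script name, not confirmed by the source):

```xml
<launch>
  <!-- Hypothetical launch file: pkg/type/name are assumptions -->
  <node pkg="humanpose_driver" type="humanpose_driver.py"
        name="humanpose_driver" output="screen" />
</launch>
```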