# Software Setup

## Synopsis of the HumanPose System

The code in this repository relies on the [Intel® distribution of OpenVINO™ toolkit](https://software.intel.com/en-us/openvino-toolkit). OpenVINO (Open Visual Inference and Neural network Optimization) is a free toolkit that facilitates the optimization of deep learning models and their deployment onto Intel hardware. Note that the current release of this repository is intended to run on an Intel® CPU (deployment on an iGPU or an Intel® Movidius™ Vision Processing Unit (VPU) could be considered in future releases). I have made another release of this project that relies on the original release of OpenPose, but it needs a powerful GPU to run fast enough. Thanks to OpenVINO, the Tello Selfie Assistant can work on a much broader variety of hardware configurations.

## Docker

### How to Install Docker

The HumanPose system runs inside a Docker container. To make full use of its capabilities, it is highly recommended that you run the program through Docker on Ubuntu 18.04 64-bit.

Follow Docker's step-by-step installation instructions:

{% embed url="<https://docs.docker.com/engine/install/ubuntu/>" %}

## Docker Contents

#### Content of the docker image

1. Installation of Ubuntu 18.04
2. Installation of a few packages (compiler, cmake, interface to sound drivers, git,...)
3. Clone of this Github repository
4. Installation of Python via Miniconda. I chose Miniconda because installing pyav is much simpler with conda than with pip
5. Download and installation of the following packages of OpenVINO:
   * Inference Engine Development Kit
   * Inference Engine Runtime for Intel CPU
   * OpenCV (optimized for Intel processors)
6. Installation of the following Python packages:
   * TelloPy: a Tello drone controller from <https://github.com/hanyazou/TelloPy>. I have slightly modified it so that I can run it in 2 processes; this modified version is in the TelloPy directory of this repository;
   * simple-pid: a simple and easy-to-use PID controller (<https://github.com/m-lundberg/simple-pid>);
   * pynput: to read keyboard events (the keyboard can also be used to pilot the drone);
   * pygame: to play sounds (gives audio feedback when poses are recognized).
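The role of simple-pid in the tracking loop is to drive an error (for example, the offset between the subject and the frame center) toward zero. Here is a minimal sketch of the PID idea in plain Python, not the simple-pid API; the gains and the toy update loop are made up for illustration:

```python
# Minimal PID illustration (hypothetical gains; not the simple-pid API).
def make_pid(kp, ki, kd):
    state = {"integral": 0.0, "prev_error": None}

    def update(error, dt=1.0):
        # Standard PID terms: proportional, accumulated integral, derivative.
        state["integral"] += error * dt
        derivative = 0.0 if state["prev_error"] is None else (error - state["prev_error"]) / dt
        state["prev_error"] = error
        return kp * error + ki * state["integral"] + kd * derivative

    return update

# Toy loop: the "plant" is a subject offset that the command pushes toward 0.
pid = make_pid(kp=0.5, ki=0.05, kd=0.1)
position = 10.0
for _ in range(100):
    position += pid(-position)
```

In the real application, the error would come from the estimated keypoint positions and the controller output would be turned into flight commands.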

## Docker Commands

On your Docker host, you first need to authorize access to the X server:

```bash
xhost +local:
```

Then run the docker image:

```bash
docker run -it --rm -e DISPLAY=${DISPLAY} -v /tmp/.X11-unix:/tmp/.X11-unix --network host --device /dev/snd  -e PULSE_SERVER=unix:${XDG_RUNTIME_DIR}/pulse/native -v ${XDG_RUNTIME_DIR}/pulse/native:${XDG_RUNTIME_DIR}/pulse/native -v ~/.config/pulse/cookie:/root/.config/pulse/cookie -v $PICTURE_DIR:/work/pictures --name tello geaxgx/tello_humanpose_openvino:latest
```

This starts the container with access to the display, sound (PulseAudio), and the host network. Note that `$PICTURE_DIR` must be set to a host directory beforehand; it is mounted at `/work/pictures` inside the container.

## Container Contents

### 1. tello\_selfie\_assistant.py

```bash
usage: tello_selfie_assistant.py [-h] [-l LOG_LEVEL] [-1] [--no_sound]
                                 [-k KEYBOARD_LAYOUT] [-s HEIGHT_SIZE]

optional arguments:
  -h, --help            show this help message and exit
  -l LOG_LEVEL, --log_level LOG_LEVEL
                        select a log level (info, debug,...)
  -1, --monoprocess     use 1 process (instead of 2)
  --no_sound            Deactivate sound
  -k KEYBOARD_LAYOUT, --keyboard_layout KEYBOARD_LAYOUT
                        Your keyboard layout (QWERTY or AZERTY)
                        (default=QWERTY)
  -s HEIGHT_SIZE, --height_size HEIGHT_SIZE
                        Network input layer height size. The smaller the
                        faster (default=256)
```

**--height\_size HEIGHT\_SIZE**

Before a video frame is processed by the human pose model, the frame is first resized to the input layer size (*input\_height*, *input\_width*). By default *input\_height* = 256, and *input\_width* = *input\_height \* frame\_width / frame\_height*. The model can be [reshaped](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_ShapeInference.html#usage_of_reshape_method) by giving a new value to *input\_height*. A smaller *input\_height* makes the inference faster, but the pose estimation may be less accurate. My advice: start with the default value (256) and, if the FPS is too slow, decrease it by 10%, and so on until you find the best compromise.
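The resizing rule above can be written out explicitly. A sketch (the helper name is made up, and the exact rounding used by the repository may differ):

```python
def network_input_size(frame_width, frame_height, input_height=256):
    """Input layer size derived from the frame aspect ratio.

    Implements input_width = input_height * frame_width / frame_height
    (the rounding here is illustrative).
    """
    input_width = int(round(input_height * frame_width / frame_height))
    return input_height, input_width

# The Tello streams 960x720 frames, so the default input_height of 256 gives:
print(network_input_size(960, 720))  # (256, 341)
```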

**--monoprocess**

You shouldn't need to use this option. By default (without this option), the application uses 2 processes: one receives the video frames sent by the drone, and the other runs the pose estimation, recognizes poses, and sends flight commands back to the drone. With '--monoprocess', a single process manages communications in both directions, and the FPS is a bit lower.
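A minimal sketch of that two-process split with Python's `multiprocessing` (the names and the simulated frames are illustrative, not the repository's actual code):

```python
import multiprocessing as mp

def receiver(frame_q, n_frames):
    # Process 1: receive video frames from the drone (simulated here).
    for i in range(n_frames):
        frame_q.put(f"frame-{i}")
    frame_q.put(None)  # sentinel: end of stream

def processor(frame_q, result_q):
    # Process 2: pose estimation + flight commands (simulated here).
    while (frame := frame_q.get()) is not None:
        result_q.put(f"command-for-{frame}")

def run_pipeline(n_frames=5):
    ctx = mp.get_context("fork")  # fork keeps the sketch simple on Linux
    frame_q, result_q = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=receiver, args=(frame_q, n_frames)),
             ctx.Process(target=processor, args=(frame_q, result_q))]
    for p in procs:
        p.start()
    results = [result_q.get() for _ in range(n_frames)]  # drain before joining
    for p in procs:
        p.join()
    return results

if __name__ == "__main__":
    print(run_pipeline(3))
```

In the real pipeline the queues carry decoded video frames, and the split exists so that slow inference never blocks frame reception.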

**--no\_sound**

Use this option to deactivate sound. You shouldn't normally need it: the audio feedback is convenient when some actions are triggered.

**--keyboard\_layout KEYBOARD\_LAYOUT**

If you want to use your keyboard to pilot the drone, use this argument to select your keyboard layout. Here is the key/action mapping:

| QWERTY      | AZERTY      | Action                      |
| ----------- | ----------- | --------------------------- |
| w           | z           | Go forward                  |
| s           | s           | Go backward                 |
| a           | q           | Go Left                     |
| d           | d           | Go right                    |
| q           | a           | Rotate left                 |
| e           | e           | Rotate Right                |
| ←           | ←           | Rotate left                 |
| →           | →           | Rotate right                |
| ↑           | ↑           | Go up                       |
| ↓           | ↓           | Go down                     |
| Tab         | Tab         | Takeoff                     |
| Backspace   | Backspace   | Landing                     |
| p           | p           | Palm Landing                |
| t           | t           | Toggle tracking             |
| h           | h           | Toggle HumanPose estimation |
| Enter       | Enter       | Take a picture              |
| 0,1,2,3,4,5 | 0,1,2,3,4,5 | Set Video encoder rate      |
| 7           | 7           | Decrease exposure           |
| 8           | 8           | Auto exposure               |
| 9           | 9           | Increase exposure           |

### 2. human\_pose.py

This script is the module that does the pose estimation. If you have a webcam and have run the container with `--device=/dev/video0:/dev/video0` added to the `docker run` command above, you can apply the human pose estimation directly to your webcam stream.

```bash
python human_pose.py -i /dev/video0
```

### 3. test\_gestures.py

As with `human_pose.py`, you can use your webcam video stream to test and practice the different poses predefined in `tello_selfie_assistant.py`. Simply run:

```bash
python test_gestures.py -i /dev/video0
```
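To give an idea of what a predefined pose check involves, here is a purely illustrative example (the actual poses and thresholds live in `tello_selfie_assistant.py`; the keypoint names and the rule below are made up): a "hands up" pose can be detected by comparing keypoint coordinates.

```python
# Illustrative pose check on 2D keypoints. Image coordinates: y grows
# downward, so "above" means a smaller y. Names and rule are hypothetical.
def hands_up(keypoints):
    """True if both wrists are above the nose."""
    nose_y = keypoints["nose"][1]
    return (keypoints["left_wrist"][1] < nose_y and
            keypoints["right_wrist"][1] < nose_y)

pose = {"nose": (120, 80), "left_wrist": (60, 40), "right_wrist": (180, 50)}
print(hands_up(pose))  # True
```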

### 4. humanpose\_driver.py

This provides the ROS Melodic conversion that allows the HumanPose system to interface with other drone types. It is used to launch a ROS node:

```bash
roscore
roslaunch humanpose_driver
```


