Technical Summary
This package uses an optical flow technique to estimate the pose of the camera relative to the surrounding environment. In our tests, a significant amount of error accumulated over time, causing the estimated pose to drift far from the camera's true position. Lowering the sampling frame rate helped reduce this noise, but sudden movements in the frame still produced inaccurate visual pose estimates. We determined that this package is not necessary for estimating the vehicle's pose; instead, we rely on the local position estimate from the flight controller, which uses the downward-facing lidar for height estimation and accelerometer data for position and orientation. We are still experimenting with this package to find out whether it can be tuned and integrated into our pose estimate without introducing large amounts of noise/error into the local position estimate, since that error could cause a crash during autonomous navigation.
To get this package working on the 64-bit ARM processor used by the Jetson TX2, we had to create our own version of FAST corner detection that compiles against the ARM NEON SIMD instructions rather than the SSE2 instructions available on x86/x64 processors.
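As a rough illustration of the kind of change this required (a hypothetical sketch, not the package's actual patch), the core of the port is a compile-time switch between intrinsic sets. The helper below, with made-up names, computes a per-pixel absolute difference of the kind FAST's brightness comparisons are built on, using NEON on ARM and SSE2 on x86/x64:

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#elif defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: absolute difference of two rows of 8-bit pixels.
void abs_diff_u8(const uint8_t* a, const uint8_t* b, uint8_t* out, size_t n) {
    size_t i = 0;
#if defined(__ARM_NEON)
    // ARM path (Jetson TX2): 16 pixels per iteration with NEON.
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(out + i, vabdq_u8(va, vb));
    }
#elif defined(__SSE2__)
    // x86/x64 path: emulate unsigned abs-diff with two saturating subtractions.
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        __m128i d = _mm_or_si128(_mm_subs_epu8(va, vb), _mm_subs_epu8(vb, va));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i), d);
    }
#endif
    // Scalar tail, and fallback when neither instruction set is available.
    for (; i < n; ++i)
        out[i] = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
}
```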
The disparity image is computed using OpenCV's block matching algorithm. A WLS (weighted least squares) filter then generates what OpenCV calls a 'dense reconstruction' of the block matching disparity image. From there, the Point Cloud Library (PCL) publishes a point cloud to a script that combines it with the current pose data (whether from SLAM or the ZED itself) in order to generate an occupancy map.
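A minimal sketch of this stage, assuming rectified grayscale input images and using OpenCV's StereoBM together with the DisparityWLSFilter from the ximgproc contrib module; the file names and parameter values are placeholders, not our tuned settings:

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/ximgproc/disparity_filter.hpp>

int main() {
    // Assumed inputs: a rectified grayscale stereo pair (placeholder file names).
    cv::Mat left  = cv::imread("left.png",  cv::IMREAD_GRAYSCALE);
    cv::Mat right = cv::imread("right.png", cv::IMREAD_GRAYSCALE);

    // Block matching on both views; the right-view matcher feeds the
    // WLS filter's left-right consistency check.
    auto bm_left  = cv::StereoBM::create(/*numDisparities=*/64, /*blockSize=*/15);
    auto bm_right = cv::ximgproc::createRightMatcher(bm_left);

    cv::Mat disp_left, disp_right;
    bm_left->compute(left, right, disp_left);
    bm_right->compute(right, left, disp_right);

    // WLS filtering produces the 'dense reconstruction' of the raw
    // block matching disparity. Lambda/sigma values are illustrative.
    auto wls = cv::ximgproc::createDisparityWLSFilter(bm_left);
    wls->setLambda(8000.0);
    wls->setSigmaColor(1.5);

    cv::Mat disp_filtered;
    wls->filter(disp_left, left, disp_filtered, disp_right);

    // Rescale the CV_16S fixed-point disparity to 8-bit for viewing.
    cv::Mat disp_vis;
    disp_filtered.convertTo(disp_vis, CV_8U, 255.0 / (16.0 * 64.0));
    cv::imwrite("disparity_wls.png", disp_vis);
    return 0;
}
```

Note that the right-view matcher exists only to give the WLS filter a consistency signal, so this stage runs block matching twice, which is part of why the publishing rate discussed next matters.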
The inherent challenge here is generating a disparity image that captures just enough of the environment to be efficiently converted into a fused point cloud. If the disparity image carries too much data to compute, the added processing lag means the point cloud gets fused against a stale pose and is published in the wrong position in the occupancy map. A well-tuned disparity image requires several iterations of fine-tuning the block matching and WLS filter parameters. A frequency of ~1-2 Hz generates good results for the fused point cloud and resulting occupancy map. An example of the disparity image and occupancy map is shown below.
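For the conversion step itself, a hedged sketch assuming a Q reprojection matrix from the stereo calibration (e.g. from cv::stereoRectify); this is illustrative, not our integration code:

```cpp
#include <opencv2/calib3d.hpp>

// disp_filtered: CV_16S disparity from the WLS stage (fixed point, scaled by 16);
// Q: 4x4 reprojection matrix from the stereo calibration.
cv::Mat disparityToPoints(const cv::Mat& disp_filtered, const cv::Mat& Q) {
    cv::Mat disp_f32, points3d;
    disp_filtered.convertTo(disp_f32, CV_32F, 1.0 / 16.0);  // undo the x16 scale
    cv::reprojectImageTo3D(disp_f32, points3d, Q, /*handleMissingValues=*/true);
    return points3d;  // CV_32FC3: per-pixel (X, Y, Z) in the left camera frame
}
```

In the pipeline, these per-pixel points are packed into a PCL cloud, combined with the current pose, and published at the ~1-2 Hz rate noted above.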
Using a semi-global block matching (SGBM) disparity image would provide much better detail and would ideally generate a better occupancy map; however, the computational cost is too great. The point cloud frequency drops to 0.25-0.5 Hz, which is insufficient for path planning purposes.
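For reference, trying SGBM is essentially a one-line change to the matcher construction in the sketch above; the parameter values here are illustrative:

```cpp
#include <opencv2/calib3d.hpp>

// Drop-in replacement for the StereoBM matcher above. P1/P2 follow the
// common heuristic of 8*blockSize^2 and 32*blockSize^2 for grayscale input.
cv::Ptr<cv::StereoSGBM> makeSgbm() {
    const int blockSize = 5;
    return cv::StereoSGBM::create(
        /*minDisparity=*/0, /*numDisparities=*/64, blockSize,
        /*P1=*/8 * blockSize * blockSize,
        /*P2=*/32 * blockSize * blockSize,
        /*disp12MaxDiff=*/1, /*preFilterCap=*/31,
        /*uniquenessRatio=*/10, /*speckleWindowSize=*/100, /*speckleRange=*/2);
}
```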
We are looking into using the GPU to generate a much higher quality base disparity image than the block matching approach produces. This way we don't have to sacrifice important environment information, the loss of which could result in an obstacle going undetected. We will then apply the WLS filter to this new image and compare the resulting fused point cloud and occupancy map to our current results.
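The exact GPU matcher is still undecided (the Stereo DNN referenced below is one candidate), so purely as an illustration of the intended structure, here is how a GPU matcher from OpenCV's cudastereo contrib module could slot in ahead of the same CPU-side WLS stage:

```cpp
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudastereo.hpp>

// Illustrative only: compute the base disparity on the GPU with a
// belief-propagation matcher (higher quality than plain block matching),
// then hand the result back to the CPU for WLS filtering as before.
cv::Mat gpuDisparity(const cv::Mat& left, const cv::Mat& right) {
    cv::cuda::GpuMat d_left(left), d_right(right), d_disp;
    auto matcher = cv::cuda::createStereoBeliefPropagation(/*ndisp=*/64);
    matcher->compute(d_left, d_right, d_disp);
    cv::Mat disp;
    d_disp.download(disp);
    return disp;
}
```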
See 'Generating Disparity Images from the Stereo DNN' in the Technical Summary of the NVIDIA Redtail Project.