For each detection box, estimate the distance to the detected object from the depth map, and overlay the distance value at the center of the box.
Then run this on a few YouTube videos of city driving, record the processed output, and, if possible, upload the result to YouTube directly from the notebook.
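
The distance step can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation, and it rests on several assumptions that the task text does not state: the depth map has the same resolution as the video frame, its values are already metric (many monocular depth models output relative or inverse depth, which would need scaling first), and boxes are given as `(x1, y1, x2, y2)` pixel coordinates. The function names `box_distances` and `draw_distances` and the `shrink` parameter are hypothetical. The median over a central patch is one possible choice for making the estimate less sensitive to background pixels inside the box.

```python
import numpy as np


def box_distances(depth_map, boxes, shrink=0.5):
    """Return one (cx, cy, distance) tuple per box.

    Assumptions (not from the original task): depth_map is a 2-D array
    aligned with the frame, values are metric, and boxes are
    (x1, y1, x2, y2) in pixels. `shrink` is the fraction of the box
    used as the central sampling patch.
    """
    h, w = depth_map.shape[:2]
    results = []
    for x1, y1, x2, y2 in boxes:
        # Clip the box to the image and make sure the corners are ordered.
        x1, x2 = sorted((int(np.clip(x1, 0, w - 1)), int(np.clip(x2, 0, w - 1))))
        y1, y2 = sorted((int(np.clip(y1, 0, h - 1)), int(np.clip(y2, 0, h - 1))))
        cx, cy = (x1 + x2) // 2, (y1 + y2) // 2

        # Sample a central patch rather than the full box, so background
        # pixels near the box edges influence the estimate less.
        half_w = max(1, int((x2 - x1) * shrink / 2))
        half_h = max(1, int((y2 - y1) * shrink / 2))
        patch = depth_map[max(0, cy - half_h):cy + half_h + 1,
                          max(0, cx - half_w):cx + half_w + 1]

        # Ignore invalid readings (zeros, NaN, inf) before taking the median.
        valid = patch[np.isfinite(patch) & (patch > 0)]
        dist = float(np.median(valid)) if valid.size else float("nan")
        results.append((cx, cy, dist))
    return results


def draw_distances(frame, results, unit="m"):
    """Overlay each distance as text centered on its box (requires OpenCV)."""
    import cv2  # imported lazily so box_distances works without OpenCV

    font, scale, thickness = cv2.FONT_HERSHEY_SIMPLEX, 0.6, 2
    for cx, cy, dist in results:
        if np.isnan(dist):
            continue  # no valid depth inside this box
        label = f"{dist:.1f} {unit}"
        (tw, th), _ = cv2.getTextSize(label, font, scale, thickness)
        # putText anchors at the bottom-left of the text, so shift by half
        # the text size to center the label on (cx, cy).
        cv2.putText(frame, label, (cx - tw // 2, cy + th // 2),
                    font, scale, (0, 255, 0), thickness)
    return frame
```

In a per-frame loop, one would call `box_distances(depth, boxes)` with the detector and depth outputs for that frame, then pass the result to `draw_distances(frame, results)` before writing the frame to the output video.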