Compound Eye Talks Future of Transportation Tech with the Autonocast
3/9/22
Interview
We were delighted to talk cameras and 3D perception with Ed Niedermeyer and Kirsten Korosec on The Autonocast. The episode is packed with information, so I've summarized the high-level takeaways below.
Beyond camera vs lidar
"If I was brought in to run this effort at Tesla tomorrow, the most important thing that I would be doing would not be restoring the radar or finding a lidar start-up to crown as the winner by putting them on every Tesla. I would move the cameras on the car, putting them in a different place." - Jason Devitt, CEO
There are two myths fueling a heated debate in the autonomous vehicle space:
- Cameras simply "guess" the distance to nearby objects using neural networks.
- Lidar is the only reliable sensor for measuring depth or for recovering accurate scene geometry.
Both are incorrect. Cameras with overlapping fields of view can measure depth from parallax using techniques called multi-view geometry or stereo.
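To make the parallax claim concrete, here is a minimal sketch of the simplest multi-view case: two rectified, parallel cameras, where depth follows from Z = f * B / d (focal length times baseline, divided by disparity). The focal length, baseline, and disparity values are illustrative assumptions, not figures from the episode or from any particular vehicle.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in meters of a point seen by two rectified, parallel cameras.

    focal_px     -- focal length in pixels
    baseline_m   -- distance between the two camera centers in meters
    disparity_px -- horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error_per_pixel(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Approximate depth change caused by a one-pixel disparity error.

    Differentiating Z = f * B / d gives |dZ/dd| = Z**2 / (f * B).
    """
    return depth_m ** 2 / (focal_px * baseline_m)


# Hypothetical rig: values chosen only to keep the arithmetic easy to follow.
FOCAL_PX = 1000.0
BASELINE_M = 0.5

for disparity in (50.0, 10.0, 5.0):
    z = depth_from_disparity(FOCAL_PX, BASELINE_M, disparity)
    err = depth_error_per_pixel(FOCAL_PX, BASELINE_M, z)
    print(f"disparity {disparity:5.1f} px -> depth {z:6.1f} m, "
          f"about {err:.1f} m of error per pixel of mismatch")
```

In this simplified model, depth is measured from geometry rather than inferred from what an object looks like, and the error grows with the square of the distance. A wider baseline or a longer focal length produces more disparity at a given distance, so the same matching error costs less depth accuracy.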
The origin of the camera myth
Companies like Tesla use monocular vision systems to power advanced driver assistance features. With this approach*, depth is estimated using neural networks that rely on prior knowledge about discrete objects (bikes, pedestrians, other cars), their size, and semantic cues (parallel lines appear to converge as they recede into the distance; points higher in the image are generally farther away, and so on). If the system encounters something it has never seen in training, it will fail.
The loudest voices in perception have highlighted both the power and the pitfalls of monocular techniques without considering the staggering potential of parallax, which becomes available as soon as more than one camera shares an overlapping field of view.
*Monocular systems can also use motion parallax to recover depth information about stationary objects. Motion parallax does not work when the car itself is stationary, e.g., when you are in the middle of an unprotected turn and trying to gauge the distance to oncoming traffic, or when you are just moving off from a parked position. It also does not give accurate depth for other moving objects.
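The two limitations in this footnote can be sketched with the same triangulation relation, shift = f * B / Z, where a single camera gets its baseline B from the distance it travels between frames. This is a deliberately simplified model: it assumes pure sideways camera motion, and the focal length, frame interval, and speeds are illustrative assumptions.

```python
def motion_baseline_m(speed_mps: float, frame_interval_s: float) -> float:
    """Baseline a single camera gains by moving between two frames."""
    return speed_mps * frame_interval_s


def parallax_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Image shift of a static point at depth_m for a sideways baseline."""
    return focal_px * baseline_m / depth_m


def depth_assuming_static(focal_px: float, baseline_m: float, observed_shift_px: float) -> float:
    """Depth recovered from an observed shift if the scene is assumed static."""
    return focal_px * baseline_m / observed_shift_px


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
FOCAL_PX = 1000.0
FRAME_INTERVAL_S = 0.1
TRUE_DEPTH_M = 50.0

# 1. Camera moving: a static point produces a measurable shift.
moving = motion_baseline_m(10.0, FRAME_INTERVAL_S)
print("moving camera shift (px):", parallax_px(FOCAL_PX, moving, TRUE_DEPTH_M))

# 2. Camera stationary: the baseline is zero, so every depth gives zero shift
#    and there is nothing to triangulate.
parked = motion_baseline_m(0.0, FRAME_INTERVAL_S)
print("stationary camera shift (px):", parallax_px(FOCAL_PX, parked, TRUE_DEPTH_M))

# 3. Moving target: if the object also moves 1 m in the opposite direction,
#    the relative displacement is baseline + 1 m, but the static-scene
#    assumption attributes the whole shift to the camera's own motion.
object_motion_m = 1.0
observed = parallax_px(FOCAL_PX, moving + object_motion_m, TRUE_DEPTH_M)
print("depth estimate for moving target (m):",
      depth_assuming_static(FOCAL_PX, moving, observed))
```

Under these assumptions, the target that is really 50 m away is placed at 25 m, because its own motion is indistinguishable from extra parallax. Two cameras capturing the same instant do not have this problem, since their baseline is fixed and does not depend on anything in the scene holding still.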
The origin of the lidar myth
The original lidar versus camera showdown occurred during the DARPA Grand Challenges. Except it wasn't much of a showdown: in the early 2000s, there was no viable solution based on computer vision.
"If I had ten years to program the software, I'd use cameras." - David Hall, founder of Velodyne, DARPA Grand Challenge 2004
Lidar was the right decision… in 2004!