Tesla Replicated The Visibility Of Lidar With Its Realtime Vision Processing System
Tesla’s progress with artificial intelligence and neural nets has propelled its Autopilot and Full Self-Driving solutions to the front of the pack. This is the result of the brilliant work of a large team of Autopilot directors and staff, including Tesla’s Senior Director of AI, Andrej Karpathy. Karpathy presented Tesla’s methods for training its AI at the Scaled ML Conference in February. Along the way, he shared specific insights into Tesla’s methods for approaching the accuracy of traditional laser-based lidar with just a handful of cameras.
The secret sauce in Tesla’s ever-evolving solution is not the cameras themselves, but rather the advanced processing and neural nets they have built to make sense of the wide range and quality of inputs. One new technique Tesla’s AI team has built is called pseudo-lidar. It blurs the line between traditional computer vision and the powerful point map world of lidar.
Traditional lidar-based systems rely on an array of lidar hardware to provide an unparalleled view of the world around the vehicle. These systems leverage invisible lasers or similar tech to send a massive number of pings out into the world to detect surrounding objects.
A visualization of a Ford lidar system on display at CES in 2016. Photo by Kyle Field, CleanTechnica
The result is a realtime visualization of what the world around a vehicle looks like based on the distance of each laser point. The computer translates the points into a 3D representation and is able to identify other vehicles, humans, roads, buildings, and the like as a means of enabling the vehicle to navigate in that world more safely.
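The conversion described above is straightforward trigonometry: each laser ping returns an angle pair and a distance, which the computer turns into a point in 3D space. Here is a minimal sketch of that idea (the function name and toy sweep data are illustrative, not any vendor's actual firmware):

```python
import math

def lidar_point_to_xyz(azimuth_deg, elevation_deg, range_m):
    """Convert one lidar return (spherical coordinates: horizontal
    angle, vertical angle, measured distance) into a Cartesian
    (x, y, z) point relative to the sensor."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = range_m * math.cos(el) * math.cos(az)
    y = range_m * math.cos(el) * math.sin(az)
    z = range_m * math.sin(el)
    return (x, y, z)

# A full sweep is just this conversion applied to every ping;
# the resulting list of points is the "point cloud."
sweep = [(0.0, 0.0, 10.0), (90.0, 0.0, 5.0), (0.0, 30.0, 2.0)]
point_cloud = [lidar_point_to_xyz(az, el, r) for az, el, r in sweep]
```

Real lidar units fire hundreds of thousands of such pings per second, so the downstream software works with dense clouds of these points rather than a handful.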
In recent years, the push towards autonomous driving has resulted in a massive surge in the development of lidar units themselves, and the supporting software solutions that use them. Even so, the cost of lidar systems continues to be prohibitive, with single sensors costing thousands of dollars each. Cameras, on the other hand, only cost a few dollars each, thanks to their prevalence in smartphones, laptops, and the like.
Tesla’s camera-based approach is much cheaper and easier to implement on the hardware side, but requires an insanely complex computer system to translate raw camera inputs and vehicle telematics into intelligence. At a foundational level, the computer can identify lane markings, signs, and other vehicles from a series of sequential static images, also known as a video.
Tesla is taking computer vision to unprecedented levels, analyzing not just images, but individual pixels within the image. “We take a pseudo-lidar approach where you basically predict the depth for every single pixel and you can cast out your pixels,” Karpathy said. Doing this over time replicates much of the functionality of a traditional lidar system, but requires a massive amount of realtime processing power for the image deconstructions to be of any use.
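The "cast out your pixels" step Karpathy describes is a standard back-projection through a pinhole camera model: once a neural net has predicted a depth for each pixel, simple geometry turns the image into a point cloud. A minimal sketch of that idea follows; the function and parameter names are illustrative assumptions, not Tesla's actual code, and the depth map here is a toy stand-in for a network's output:

```python
import numpy as np

def pixels_to_points(depth_map, fx, fy, cx, cy):
    """Back-project every pixel of a predicted depth map into a 3D
    point cloud using a pinhole camera model.

    depth_map: (H, W) array of per-pixel depths in meters
               (in practice, the output of a neural net).
    fx, fy:    camera focal lengths in pixels.
    cx, cy:    principal point (optical center) in pixels.
    Returns an (H*W, 3) array of (x, y, z) points.
    """
    h, w = depth_map.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_map
    x = (u - cx) * z / fx  # horizontal offset scales with depth
    y = (v - cy) * z / fy  # vertical offset scales with depth
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy example: a 2x2 "depth map" where every pixel is 4 m away.
cloud = pixels_to_points(np.full((2, 2), 4.0),
                         fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

The geometry itself is cheap; the hard part, as the article notes, is producing an accurate per-pixel depth estimate from camera images in the first place.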
Vehicles are driven in realtime, so it doesn’t do any good to have a system that can make determinations or predictions based on an image if the results are not available instantaneously. Thankfully, Tesla built its own hardware for the third major version of its autonomous driving computer, and it was purpose-built to run Tesla’s code.
Screen capture from Karpathy’s presentation.
Achieving the functionality of lidar is important, as it unlocks all of the software solutions that were built to utilize inputs from traditional lidar systems. “You basically simulate lidar input that way, but it’s purely from vision. Then you can use a lot of techniques that have been developed for lidar processing to achieve 3D object detection.” It’s like giving a GPS to someone navigating a forest with just a compass and a map. It doesn’t solve the problem, but it is another extremely valuable tool in developing the best solution.
“The Gap is Quickly Closing”
Tesla’s so-called pseudo-lidar solution is getting better. Karpathy showed off a range of lidar-esque 3D maps of the world that look a heck of a lot like the results coming from cutting-edge lidar solutions. Of course, visualizations are more for the benefit of humans than computers, so they don’t truly communicate just how impactful Tesla’s progress with computer vision is. “If you give yourself lidar and how well you can do versus if you do not have lidar, but you just use vision techniques and pseudo-lidar approaches, the gap is quickly closing,” Karpathy said.
About the Author
Kyle Field I'm a tech geek passionately in search of actionable ways to reduce the negative impact my life has on the planet, save money, and reduce stress. Live intentionally, make conscious decisions, love more, act responsibly, play. The more you know, the less you need. TSLA investor.