The first published paper from NVIDIA demonstrates a fully autonomous self-driving car built on a deep learning architecture that combines perception and inference. The solution places GPU-based, self-learning deep learning at the heart of the vehicle, where it receives continuous, ultra-low-latency sensory input and delivers high-precision, high-bandwidth control and decision-making feedback to the other system modules. As a proof of concept, the system can drive autonomously for several hours on a single charge of the vehicle's battery, using only data acquired by the in-vehicle sensors and cameras. Much like a human driver, it can also monitor its own sensors and cameras and detect signs of critical sensor failure or external threats.
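To make that data flow concrete, the sketch below shows one plausible shape such an end-to-end pipeline could take: a camera frame goes in, a steering command comes out. The network layout, layer sizes, and the name SteeringNet are illustrative assumptions on our part, not NVIDIA's published architecture.

```python
import torch
import torch.nn as nn

class SteeringNet(nn.Module):
    """Hypothetical end-to-end network: one camera frame in, one steering command out."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(48, 64, kernel_size=3), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1152, 100), nn.ReLU(),   # 1152 = 64*1*18 for a 66x200 input
            nn.Linear(100, 50), nn.ReLU(),
            nn.Linear(50, 1),                  # single steering value
        )

    def forward(self, frame):
        return self.head(self.features(frame))

# One low-latency control step: a 66x200 RGB frame mapped to a steering command.
model = SteeringNet().eval()
frame = torch.rand(1, 3, 66, 200)              # placeholder for a real camera frame
with torch.no_grad():
    steering = model(frame)
```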
Portable multimedia, such as mobile applications and mobile gaming, requires different technical characteristics depending on the usage model. The NVIDIA Volta mobile GPU family features a multi-modal architecture that enables the development of compact designs tailored to popular uses.
Volta GV100 also ships with NVIDIA Volta Vision 2, the world's first AI-driven smart sensor with 8-bit fixed-point image enhancement and 1080p/120fps RGB. Volta Vision 2 brings the power of a Full-HD camera to mobile devices, with the functionality of a point-and-shoot camera at a price comparable to that of an 8 MP, 1920x1080 CMOS sensor. It integrates the camera, vision chip, and sensors in a monolithic package that is easy to design into new products.
We have not yet heard from people in the field about how effective this inference is in practice, and the degree to which these volume-based, granular products can be implemented at scale on this architecture remains to be seen. Although we are still developing the capability to process Kinect data as a 3D volume, we are confident that, given all the hand coding required to implement inference on Volta, we can do much better than what has been reported.
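The capability we are developing can be sketched as a simple voxelization step: back-project a Kinect depth frame into a point cloud, then bin the points into a dense 3D occupancy volume that inference can consume. The intrinsics, grid extents, and function names below are illustrative assumptions, not our production values.

```python
import numpy as np

# Hypothetical Kinect-style intrinsics; real values come from device calibration.
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5
VOXEL_SIZE = 0.05                              # 5 cm voxels
GRID_SHAPE = (128, 128, 128)
GRID_ORIGIN = np.array([-3.2, -3.2, 0.0])      # metres, in camera coordinates

def depth_to_points(depth_m):
    """Back-project an (H, W) depth image in metres into an (N, 3) point cloud."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]                  # drop invalid (zero-depth) pixels

def voxelize(points):
    """Bin points into a dense 3D occupancy volume suitable for volumetric inference."""
    idx = np.floor((points - GRID_ORIGIN) / VOXEL_SIZE).astype(int)
    keep = np.all((idx >= 0) & (idx < np.array(GRID_SHAPE)), axis=1)
    volume = np.zeros(GRID_SHAPE, dtype=np.float32)
    volume[tuple(idx[keep].T)] = 1.0
    return volume

# Example with a synthetic depth frame; real frames would come from the sensor driver.
depth = np.full((480, 640), 1.5, dtype=np.float32)
volume = voxelize(depth_to_points(depth))
```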
As of right now, NVIDIA has only provided 3D reconstruction from a 1D scanned log, so it only works when a complete log of points is read from the scanner one point at a time. To get around this problem, we used a previous state-of-the-art volumetric reconstruction algorithm to create the point cloud and then used that point cloud to compute the 3D reconstruction. We then updated the points using a new point cloud obtained from a Kinect sensor, and saw an improvement of about 5% for the points in front of the Kinect.
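The update step can be sketched as a nearest-neighbour blend between the reconstructed points and the new Kinect cloud. The helper below, update_reconstruction, and its parameters are illustrative assumptions rather than the exact algorithm we ran, and it presumes both clouds are already expressed in the same coordinate frame.

```python
import numpy as np
from scipy.spatial import cKDTree

def update_reconstruction(recon_pts, kinect_pts, max_dist=0.05, alpha=0.5):
    """Nudge reconstructed points towards nearby Kinect measurements.

    recon_pts  : (N, 3) points from the prior volumetric reconstruction
    kinect_pts : (M, 3) points from a new Kinect frame (same coordinate frame)
    max_dist   : correspondence cut-off in metres
    alpha      : blend factor (0 keeps the reconstruction, 1 trusts the Kinect)
    """
    tree = cKDTree(kinect_pts)
    dist, nearest = tree.query(recon_pts)
    matched = dist < max_dist
    updated = recon_pts.copy()
    updated[matched] = ((1 - alpha) * recon_pts[matched]
                        + alpha * kinect_pts[nearest[matched]])
    return updated

# Synthetic usage: refine a reconstruction with a slightly noisier Kinect cloud.
recon = np.random.rand(1000, 3)
kinect = recon + np.random.normal(scale=0.01, size=recon.shape)
refined = update_reconstruction(recon, kinect)
```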
The trade-off with this form of volumetric reconstruction is that the image above has to be reduced from an RGB data sequence back to a grayscale image. This increases the amount of data handling, and the effective cost rises to the point where it is no longer justified. That is already the case for the imager, but even more so for the Kinect sensor. In addition, the colour data the Kinect captures differs from that used in the previous form of volumetric reconstruction. For the imager, we lose the ability to do colour analysis; for the Kinect sensor, we lose the ability to process the RGB data as a true RGB volume, because it is encoded in a way that is optimised for use with the device itself, and we have no access to the raw data or the algorithms used to generate it.
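As a concrete illustration of that reduction step, the sketch below collapses an RGB frame sequence to grayscale using standard BT.601 luma weights. The function name and the choice of weights are our own illustrative assumptions, not part of the reconstruction pipeline described above; the example also makes clear why per-channel colour analysis is lost after the reduction.

```python
import numpy as np

def rgb_sequence_to_grayscale(frames):
    """Reduce a (T, H, W, 3) RGB frame sequence to (T, H, W) grayscale.

    Uses the common ITU-R BT.601 luma weights; once applied, per-channel
    colour analysis on the reduced frames is no longer possible.
    """
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return np.tensordot(frames.astype(np.float32), weights, axes=([-1], [0]))

# Example with a synthetic sequence; real frames would come from the imager or Kinect.
frames = np.random.randint(0, 256, size=(10, 480, 640, 3), dtype=np.uint8)
gray = rgb_sequence_to_grayscale(frames)       # shape (10, 480, 640)
```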