HPC – The Computational Foundation of Deep Learning


In this video from the 2016 Stanford HPC Conference, Bryan Catanzaro from Baidu presents: HPC – The Computational Foundation of Deep Learning.

“During the past few years, Deep Learning has made incredible progress towards solving many previously difficult Artificial Intelligence tasks. Although the techniques behind deep learning have been studied for decades, they rely on large datasets and large computational resources, and so have only recently become practical for many problems. Training deep neural networks is very computationally intensive: training one of our models takes tens of exaflops of work, and so HPC techniques are key to creating these models. As in other fields, progress in artificial intelligence is iterative, building on previous ideas. This means that the turnaround time in training one of our models is a key bottleneck to progress in AI: the quicker we can realize an idea as a trainable model, train it on a large dataset, and test it, the quicker we find ways of improving our models. Accordingly, we care a great deal about scaling our model training, and in particular, we need to strongly scale the training process. In this talk, I will discuss the key insights that make deep learning work for many problems, describe the training problem, and detail our use of standard HPC techniques to allow us to rapidly iterate on our models. I will explain how HPC ideas are becoming increasingly central to progress in AI. I will also show several examples of how deep learning is helping us solve difficult AI problems.”
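
For readers unfamiliar with the term, strong scaling means keeping the total problem size fixed while adding processors, so that the time to a finished result drops. The sketch below is not taken from the talk. It only illustrates the standard definitions of speedup and parallel efficiency, and the function names and timings are hypothetical values chosen for illustration.

```python
def strong_scaling_speedup(time_one_worker: float, time_n_workers: float) -> float:
    """Speedup for a fixed-size problem: T(1) / T(N)."""
    if time_one_worker <= 0 or time_n_workers <= 0:
        raise ValueError("timings must be positive")
    return time_one_worker / time_n_workers


def strong_scaling_efficiency(time_one_worker: float,
                              time_n_workers: float,
                              n_workers: int) -> float:
    """Parallel efficiency: speedup divided by worker count (1.0 is ideal)."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return strong_scaling_speedup(time_one_worker, time_n_workers) / n_workers


if __name__ == "__main__":
    # Hypothetical timings (hours) for training the same model on the same dataset.
    t1, t8 = 100.0, 25.0
    print("speedup:", strong_scaling_speedup(t1, t8))
    print("efficiency:", strong_scaling_efficiency(t1, t8, 8))
```

An efficiency well below 1.0 would suggest that communication or serial work is limiting how much extra hardware shortens the turnaround time the abstract describes.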