Episode 9: A Conversation with Soumith Chintala

In this episode, Byron and Soumith talk about transfer learning, child development, pain, neural networks, and adversarial networks.



Soumith Chintala is a Researcher at Facebook AI Research, where he works on deep learning, reinforcement learning, generative image models, agents for video games, and large-scale high-performance deep learning. Prior to joining Facebook in August 2014, he worked at MuseAmi, where he built deep learning models for music and vision targeted at mobile devices.


Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today our guest is Soumith Chintala. He is an Artificial Intelligence Research Engineer over at Facebook. He holds a Master of Science in Computer Science from NYU. Welcome to the show, Soumith.

Soumith Chintala: Thanks, Byron. I am glad to be on the show.

So let’s start out with your background. How did you get to where you are today? I have been reading over your LinkedIn, and it’s pretty fascinating.

It’s almost accidental that I got into AI. I wanted to be an artist, more of a digital artist, and I went to intern at a visual effects studio. After the summer, I realized that I had no talent in that direction, so I instead picked something closer to where my core strength lies, which is programming.

I started working in computer vision, but just on my own in undergrad. And slowly and steadily, I got to CMU to do robotics research. But this was back in 2009, and still deep learning wasn’t really a thing, and AI wasn’t like a hot topic. I was doing stuff like teaching robots to play soccer and doing face recognition and stuff like that.

And then I applied for master’s programs at a bunch of places. I got into NYU, and I didn’t actually know what neural networks were or anything. Yann LeCun, in 2010, was more accessible than he is today, so I went, met with him, and I asked him what kind of computer vision work he could give me to do as a grad student. And he asked me if I knew what neural networks were, and I said no.

This was a stalwart in the field who I’m sitting in front of, and I’m like, “I don’t know, explain neural networks to me.” But he was very kind, and he guided me in the right direction. And I went on to work for a couple of years at NYU as a master’s student and simultaneously as a junior research scientist. I spent another year, almost a year there as a research scientist while also separately doing my startup.

I was part of a music and machine learning startup where we were trying to teach machines to understand and play music. That startup went south, and I was looking for new things. And at the same time, I’d started maintaining this tool called Torch, which was the industry-wide standard for deep learning back then. And so Yann asked me if I wanted to come to Facebook, because they were using a lot of Torch, and they wanted some experts in there.

That’s how I came to be there, and once I was at Facebook, I did a lot of things: research on adversarial networks, engineering, building PyTorch, etc.

Let’s go through some of that stuff. I’m curious about it. With regard to neural nets, in what way do you think they are similar to how the brain operates, and in what way are they completely different?