Episode 96: A Conversation with Gary Marcus

Byron speaks with author and psychologist Gary Marcus about the nature of intelligence and what the mind really means in relation to AI.



GARY MARCUS is a scientist, best-selling author, and entrepreneur. He is Founder and CEO of Robust.AI, and was Founder and CEO of Geometric Intelligence, a machine learning company acquired by Uber in 2016. He is the author of five books, including The Algebraic Mind, Kluge, The Birth of the Mind, and The New York Times best seller Guitar Zero, as well as editor of The Future of the Brain and The Norton Psychology Reader.

He has published extensively in fields ranging from human and animal behavior to neuroscience, genetics, linguistics, evolutionary psychology and artificial intelligence, often in leading journals such as Science and Nature, and is perhaps the youngest Professor Emeritus at NYU. His newest book, co-authored with Ernest Davis, Rebooting AI: Building Artificial Intelligence We Can Trust, aims to shake up the field of artificial intelligence.


Byron Reese: This is Voices in AI, brought to you by GigaOm, and I’m Byron Reese. Today our guest is Gary Marcus. He is a scientist, author, and entrepreneur. He’s a professor in the Department of Psychology at NYU. He was the founder and CEO of Geometric Intelligence, a machine learning company later acquired by Uber. He has a new company called Robust.AI and a new book called Rebooting AI, so we should have a great chat. Welcome to the show, Gary.

Gary Marcus: Thanks very much for having me.

Why is intelligence such a hard thing to define, and why is artificial intelligence artificial? Is it really intelligence, or is it just something that can mimic intelligence, or is there not a difference between those two things?

I think different people have different views about that. I’m not doctrinaire about vocabulary. I think that intelligence itself is a multidimensional variable. People want to stuff it into a single number and say your IQ is 110, or 160, or 92, or whatever it is, but there are really many things that go into natural intelligence such as the ability to solve problems you haven’t seen before, or the ability to recognize objects, or the ability to speak or to be very verbal about it. There’s many, many different dimensions to intelligence. When we talk about artificial intelligence, we’re basically talking about whether machines can do some of those things.

You’re a provocative guy with all kinds of ideas in all different areas. Talk a little bit about the mind, how you think it comes about in 30 seconds or less, please. And will artificial intelligence need to have a mind to do a lot of the things we want it to do?

The best thing I ever heard about that, short version, is Steven Pinker was on Stephen Colbert. Colbert asked him to explain the brain in five words, and he said, “Brain cells fire in patterns.” That’s how our brains work: there’s a lot of neural firing, and minds emerge from the activity of those brains. We still don’t really understand what all that means. We don’t have a very good grip on what the neural processes are that give rise to basic things like speaking sentences. We have a long way to go understanding it in those terms.

I tend to take a psychologist’s perspective more than a neuroscience perspective and say the mind is all of our cognitive functions. It’s how we think and how we reason, how we understand our place in the world. Machines, if we want to get to the point where they’re trustworthy, are going to have to do many of the things that human minds do, not necessarily in identical ways. They have to be able to capture, for example, the flexibility that human minds have, such that when they encounter something they haven’t seen before, they can cope with it and not just break down.

I know you said you don’t usually approach it from neurology, but I’m fascinated by the nematode worm who’s got just a handful of neurons. People have spent so long, 20 years in the OpenWorm project, trying to model those 302 neurons to make behavior. They’re not even sure it’s even possible to do that.

Do you think we are going to have to crack that code and understand something about how the brain works before we can build truly intelligent machines, or is it like the old saw about airplanes and birds [flying differently]? They’re going to think in a way that’s alien to the way we think?

I think it’s somewhere in between, but I’m also pushing towards the psychology side. I don’t think that understanding the connectome of the human brain or all those connections is anytime soon going to really help us with AI. I do think that understanding psychology better, like how people reason about everyday objects as they navigate the world, that might actually help us.

Psychology isn’t as much of a prestige discipline, so to speak, as neuroscience. Neuroscience gets more money, gets more attention. Neuroscience will probably tell us a lot about the nature of intelligence in the long term. That could be a long term of 50 or 100 years. Meanwhile, thinking about psychology has actually led to some AI that I think really works. None of it’s what we call artificial general intelligence. Most of the AI we have doesn’t owe that much to neuroscience, and if anything, it owes something to psychology and people trying to figure out how human beings or other animals solve problems.

Yeah, I agree completely with that. I think AI tries to glom onto things like neural nets and all of that to try to give them some biological tie, but I think it’s more marketing than anything.

I was about to say exactly that. I think it’s more marketing than anything. Neural networks are very, very loosely modeled on the brain. I’m trying to think of a metaphor. It’d be like comparing a child’s first drawing to some incredibly elaborate work of art. Okay, they’re both drawings, but they’re really not the same thing. Neural networks, for example, only have essentially one kind of neuron, which either fires or doesn’t. Biology, first of all, separates the firing neurons from the inhibiting neurons, the positive from the negatives, and then there are probably 1,000 different kinds of neurons in the brain with many different properties. The so-called neural networks that people are using don’t have any of that. We don’t really understand how the biology works, so people just ignore it. They wind up with something that is only superficially related to how the brain actually functions.

Let’s talk about consciousness. Consciousness is the experience of being you, obviously. A computer can measure temperature, but we can feel warmth. I’ve heard it described as the last great scientific question we know neither how to pose scientifically nor what the answer would look like. Do you think that’s a fair description of the problem of consciousness?

The only part I’m going to give you grief about is that it’s the last great scientific question. I mean, as you yourself said later in your question, it’s not a well-formed question. Great scientific questions are well formed. We know what an answer would look like and what a methodology would be for answering them. Maybe we lack some instrument. We can’t do it yet. We need a bigger collider or something like that where we understand the principle of how you can get data to address it. [With] consciousness, we don’t really at this point know that.

We don’t know even what a ‘consciousness meter’ would look like. If we had one, we’d go around and do a bunch of experiments and say, “Well, does this worm that you’re talking about have consciousness? Does my cat? What if I’m asleep? What if I’m in a coma?” You could start to collect data. You could build a theory around that. We don’t even know how we would collect the data.

My view is: there is something there that needs to be answered. Obviously, there is a feeling of experiencing red, or experiencing orgasm, or whatever we would describe as consciousness. We don’t have any, I think, real scientific purchase on what it is that we’re even asking. Maybe it will turn out to be the last great scientific question, but if it is, it’ll be somehow refined relative to what it is that we’re asking right now.

Do you believe that we can create a general intelligence on some time period measured in centuries, even? Do you believe it’s possible to do that?

I do, absolutely. I’m widely known as a critic of AI, but I’m only a critic of what people are doing now, which I think is misguided in certain ways. I certainly think it’s possible to build a general intelligence. You could argue on the margins. Could a machine be conscious? I would say, “Well, it depends what you mean by conscious, and I don’t know what the answer is.”

Could you build a machine that could be a much more flexible thinker than current machines? Yes, I don’t see a principled reason why you couldn’t have a machine that was as smart as MacGyver and could figure out how to get its way out of a locked room using twist ties and rubber bands or something like that, which a current machine can’t do at all. I don’t see the principled reason why computers can’t do that, and I see at least some notion of how we might move more in that direction.

The problem right now is: people are very attracted to using large databases. We’re in the era of big data, and almost all of the research is around what you can do with big data. That leads to solutions to certain kinds of problems. How do I recognize a picture and label it if I have a lot of labels from other people that have taken similar pictures? It doesn’t necessarily lead you to questions about what would I do if I had this small amount of data, and I was addressing a problem that nobody had ever seen before? That’s what humans are good at, and that’s what’s lacking from machines. This doesn’t mean it’s an unsolvable problem in principle. It means that people are chasing research dollars and salary and stuff like that for a certain set of problems that are popular right now. My view is that AI is misguided right now, but not that it’s impossible.

Is it possible that a general intelligence would have to have first-person experience in order to be truly intelligent? For it to really be a general intelligence, it would have to be creative?

You would need some kinds of creativity. You could argue about how creative the average person is, but there’s a level of creativity that even ordinary people have that machines are lacking right now. Now, you could argue that even human creativity is more algorithmic than we recognize, and if we built the right algorithms, you’d probably get machines to do the same thing.

In fact, we now have algorithmic composition, so machines can make up music to some extent. They probably do it better than you and I can. I don’t know your musical background, no offense, but not as well as Paul McCartney could. He’s got a kind of creativity that’s outside of the bounds of what ordinary people can do.

Machines can now make some things that at least ostensibly look creative. Another example is AlphaGo making [Go] moves that humans hadn’t considered. It’s not the apotheosis maybe of creativity. But the machine is looking at a set of possibilities that people haven’t evaluated, and it comes up with a good evaluation. From the outside, that appears creative. Maybe a lot of human creativity is the same thing. Somebody explores a space that nobody had explored before. We call that creative.

I wonder, I mean, because building a general intelligence that would write the Harry Potter series—I mean, you and I both know why computers can “compose music.” I mean, it’s this really narrow…

Within limits, right?

It’s a rigged thing that—I’m not saying they’re rigging it, but it’s like what would be a problem that we could use that would imply—I know. We’ll get it to blank, and it can mimic creativity. Is that the same as being creative, or is mimicking creativity all creativity is?

I think we don’t fully know the answer, but I would say that you could at least distinguish different kinds of creativity. You can get a machine to compose in the style of Bach. It’s very hard to get a machine to come up with something that nobody has come up with before. Now, most people can’t do that either.

I wrote a book called Guitar Zero about learning to play guitar at the age of 40. There was a quote. I think it was Steve Vai who said about Jimi Hendrix, “I can play every note exactly the way he did, but I can never figure out how he decided to play it in the first place.” You’ve got the Hendrix level of creativity where somebody just thinks of something that wasn’t even on anybody else’s mind, and then you have the second tier of creativity, which is like the average person pushing out a song. Machines can go somewhere in that scale. They can’t go to the Hendrix side, I mean, maybe eventually but not now, sorry.

Right, do you remember I, Robot, where Sonny the robot is talking to the Will Smith character, who’s anti-robot? He says, “Can a robot write a symphony? Can a robot paint a masterpiece?” Sonny answered, “Well, can you?” which it sounds like is what you’re saying.

Tell me this. I’m of the opinion that the number of people actually working on general intelligence—the number of groups of people working on general intelligence is under a dozen. Would you agree?

The number of groups or individuals? The number of groups…

No, groups. You say OpenAI, Carnegie Mellon, Google. If you looked at where the dollars all go, 99% of all the money goes into what you were just saying. It’s just really this trick of studying lots of data, looking for patterns, and making projections into the future.

I mean, I would even argue that a place like OpenAI is mostly doing that, even more extreme. I think there are some people at DeepMind who certainly care about artificial general intelligence. There are some people at OpenAI, but I think you’re exactly right. The large majority of the effort is really what can we do with the current tools? Some people have come up with really clever things to do with current tools like colorizing old film, old black and white film, for example. It’s a neat way of reapplying these statistical tools.

The number of people that are working on questions like ‘What is it to understand what’s going on in the conversation?’ is very, very small, and it might well be less than a dozen. You’re not really going to get to artificial general intelligence if you have machines that can’t really follow a conversation and aren’t really even trying.

Current language techniques are mostly illusions. When you say to Alexa “Can you turn on the lights?” they’ve got a template for that, and it’s built in what they should do for it. It’s not the same thing as a machine having an understanding that you might be sitting in a dark room and would prefer light. I mean, there’s no depth to the ways that these systems are set up, and that depth is a prerequisite to what I would call artificial general intelligence.

Yeah, I wrote an article about how Alexa and Google Home answered the same exact questions differently, and they were questions you would think would [have] the same [answer]. Who designed the American Flag, or how many minutes are in a year? They gave different answers. The reason is because [for] who designed the American Flag, one said Robert Heft and one said Betsy Ross. Heft designed the 50 star configuration. Likewise, with the minutes in a year, it boiled down to one was doing a solar year and one was doing a calendar year, and so you’re right. A human would say, “Well, what do you mean by a year exactly?” A human understands the question, so [the digital assistants] can’t resolve ambiguity.

They have trouble with ambiguity. They have trouble with discourse too. Discourse is about having back and forth. Trying to refine: how did you get to that answer? What would it be like if we changed this assumption in the kind of conversation that we’re having right now? These systems don’t do that. There are other giveaways. We talk a lot about this—Ernie Davis and I do it in this new book, Rebooting AI, that’s coming out in September. We go through a lot of examples of basic things that you’d expect [machines] to be able to do, and they can’t really—like just synthesize a bunch of data across a bunch of different web pages.

If I ask you which Supreme Court justice served the longest, if that isn’t on a single specific webpage and you have to compile it across ten pages, forget it. The systems aren’t going to be able to do that because they don’t really understand the concepts. It’s one thing to match a keyword or set of words, and maybe you’ll get a hit of exactly or almost exactly the question that I’m asking. If it requires a machine to put information together that hasn’t by chance been put together before, they can’t really do that because they don’t know what any of it means.

You brought up Rebooting AI. Tell us, what’s the thesis of the book, and why did you and your coauthor decide to write it?

The thesis of the book is that AI has headed off—spiraled out of control in the wrong direction. There’s a ton of hype for it. It actually works better than it ever has before, but the direction is all about this very shallow superficial statistical learning.

It so happens that there’s a technique called deep learning that people are very excited about, but deep learning is not actually deep except in a very narrow technical sense, which is how many layers in one of these neural networks. It’s not deep in the sense of really understanding things. We wrote some pieces together, New York Times op-eds and stuff like that. We’re pretty widely read, pretty visible, and people notice them.

We couldn’t really lay out the argument in depth, and so we decided that we were going to take some case studies. Language and robotics are really the ones that we developed the best and worked through why there’s what some people might call an impedance mismatch: a mismatch between what current machines can do and what we really would need for a genuine AI to be able to solve these problems.

Then the rest of the argument is essentially about why, if we use the techniques that we have now, we’re in trouble. We’re in different kinds of trouble. Driverless cars are unlikely to be reliable enough to be able to be used generally. You might be able to use them in a specific route in particular weather, but if you want them to be able to go from any destination, any point A to point B, they’re not likely to be reliable enough because the representations are too superficial; same thing with getting machines to really understand even basic stories.

We go through a children’s story at length and show how much inference, how much reasoning and thought an ordinary person is doing reading a story written for 9-year-olds and how far away that is from what we’ve got now, which means: if you want a system to read things for you, you can’t count on it actually understanding what’s going on, so it can be easily led astray.

Now we have literature about adversarial examples, for example, systems getting fooled by the things that they see visually, but that extends to the things they read and so forth. We wind up now where we have these AI systems that people are using that are relatively easy to construct. We have the data, and so people are tempted into using them. Then they have all kinds of biases. They’re not reliable. There’s a difference between getting something 80% correct if it’s an AI system to recommend a book or an advertisement and getting an AI system to be 80% correct if it’s driving your car or taking care of your grandfather. Like the example of: if you had an eldercare robot and it’s 90% correct, lifts your grandfather into bed 90% of the time correctly, that’s just not acceptable. You can’t drop grandpa one time out of ten.

The point of the book is if we use these techniques that we have now we are in trouble, and we really, really need to reshape the enterprise, even though people are so excited about it. I’ll just say one more thing. It’s also a guide of how to be skeptical. There’s an old book that we took some inspiration from called How to Lie with Statistics, which you may have seen once upon a time. There’s a little element of that here. We’re trying to teach people how to not get sucked in by all the hype in AI.

I agree that the philosophical underpinning of what we do now is we take large datasets. We look for patterns. We make projections into the future, and that is a shallow thing, as you point out.

What do you suggest as a different methodology entirely? At the Allen Institute, Oren Etzioni, they’re trying to make an AI that can pass sixth grade science exams. They’re really, I think, trying to do that kind of understanding.

Partly because of my own instigation, I used to talk fairly often to Paul Allen, and I talked to Oren quite often, and they are focusing on commonsense reasoning as well. A lot of what we emphasize in the book is common sense. We have a chapter about what that even means. We have a chapter about lessons from how human cognition works that try to paint what we think is the least that needs to be built in.

Another thing we haven’t talked about per se is people talk a lot about machine learning right now. It’s great to have machines learn things, but they need some things built in as well in order to start. There’s a pendulum, a nature/nurture pendulum in AI, as there has been in many other fields historically. Right now, that pendulum is all the way on the side of ‘let’s learn everything from scratch.’

A perfect example of that is the Atari game system that DeepMind built that allowed them to sell themselves to Google. That system knew nothing about games except it saw pixels on a screen, and it knew how to move the joystick. It had commands built in for left, right, up, down, press the fire button. It learned everything it needed to do to, let’s say, play Breakout, without anything built in about what an object is or what space or time is and so forth. The catch is: it doesn’t really have a deep representation. You can watch it break through a wall. You’re like, ‘wow, it’s learned the whole concept of the game,’ but if you move the paddle three pixels, the whole thing falls apart. It’s, again, incredibly superficial, and that’s because it’s starting too much from scratch.

Part of what we’re arguing is you need some things built in. We take a perspective that really goes back to Plato and especially Immanuel Kant, which is: you need to be born, essentially, with a sense of space and time and objects and so forth. Liz Spelke, the developmental psychologist at Harvard, has made similar arguments with respect to human children. We’re saying, “Hey, let’s look at these things that human children, and in fact, other animals start with, and maybe we need to craft those into our machines before we set them for the job of learning so that what they learn is much richer and more sophisticated.”

If I were to pose a complex question that I would want to ask a computer and the question would be: “Dr. Smith is eating at his favorite restaurant when he receives a phone call; glancing down at his phone—or answering the call, he looks worried, gets up, runs out the door without paying his bill. Are the owners likely to prosecute him?” The person says, “Well, it’s his favorite restaurant. They probably know him. He’s a doctor. He just got some call, and he’ll just settle up the next time he comes in, so no, they aren’t going to call the police.”

A person could iterate on that too. You could add a few extra facts, and then maybe we would change our minds. You can reason about it. You could say, well, what is the relevance of how often he’s been to that particular restaurant? How is he dressed? Did he smile at anybody on the way out? You can reason as the facts go back and forth, right? When we go to a legal case, the details of the facts matter a lot as we try to reason about what people’s motivations were, what their alternatives were, what the set of possibilities were and so forth. Machines can’t do that at all right now.

How do you solve that problem? A few episodes ago, we had Doug Lenat with the Cyc project. They tried to instantiate every—they tried to build a model of the universe, as it were, hierarchical. It’s a lifetime of work. Is that what you do, or how do you solve that problem?

I’m a big fan of what Doug tried to do. I would do it differently if I were doing it now. You have to remember he started in the late 1980s, but I think that the spirit of what he was trying to do was right. We need a large database of machine interpretable knowledge. Now, he tried to hand code it all, hiring a lot of philosophers, teaching them to code that knowledge in nuanced and sophisticated ways, and it was almost all stated in logic.

If we were doing things nowadays, we would probably want much greater room for probability. Not everything fits neatly into a binary distinction. We would probably want to learn a lot of that knowledge rather than trying to hand code it. It turns out, even after 30 years, he probably hasn’t hand coded enough knowledge to work in a typical situation, so there needs to be some machine learning component—not necessarily the kind of machine learning that’s popular in the field right now, but some kind of ways for algorithms to learn from data.

I think the broad thrust of what he was trying to do is right. The metaphor I sometimes use is that Doug Lenat was trying to get across the right mountain and picked the wrong path. Most of the field right now isn’t even trying to get across the right mountain, and they’re suffering as a consequence. They may not even see the suffering, but the reality is, when they try to attack some of the problems that motivated Doug Lenat, they’re just not really doing it at all. What’s happened is the questions that used to be popular are just ignored right now.

Peter Norvig, who’s a very well-known AI researcher at Google, did his dissertation on story understanding. People used to try to solve that problem. Roger Schank was famously involved in it, and it’s not like that problem was solved. We still don’t know how to have a machine understand a story.

It’s that people moved on to other problems. How do I match keywords at scale for the web? There’s a lot of money to be made. It was very useful. There’s nothing wrong with their having done that, but these core problems have not been solved. I don’t see how to solve them without something that at least has the spirit of what Lenat was trying to accomplish.

Yeah, it’s just that industry, enterprises, if there were no more AI developments from this moment on, it would probably take ten years to just take the simple techniques we know now and apply them to all of the real problems they could solve in enterprises. How do I make my trucks use 10% less fuel or predict when my machine’s going to break down with 10% more accuracy, or where do I deadhead my planes overnight? All of these are the kinds of questions—it’s like, for the foreseeable future, all the money is going to go into this technique we know that can give measurable results. Who’s the person, or what’s the group? Is it a university that’s going to have to start over with this…?

That brings us in a way to why I built this company, just launched this company with four other people including Rodney Brooks called Robust.AI. Part of our view, Rod and I—he was one of the people who invented the Roomba robot. He’s a leading light in robotics. He and I came from different perspectives but to the same place. He’s a roboticist, and he’s been very frustrated with the state of that field. I’m a cognitive scientist, and I’m very frustrated with the way that psychology has been neglected from AI. We both look at deep learning and say, “yeah, all these statistics are nice, but they’re not rich enough.”

We both realized that in robotics, the rubber meets the road. You can’t fake things with statistics because the environment changes too much. If you wanted to build something like Rosie the Robot, for example, you’d have to have a system that is flexible enough to deal with changes in the environment, deal with the unexpected. It’s a very different situation from, say an assembly line, where you might pack a million iPhones into a million boxes, and it’s always the same. We think that robotics is a good field to push on making a better quality AI. It keeps you honest in a way that advertising recommendation just doesn’t.

That’s pretty exciting. He pioneered the concept of “the juice,” right? He said that if you lock an animal in a box, it desperately tries to get out. If you put a robot in a box, it just goes through some protocol and that difference between—he doesn’t think it’s anything supernatural or anything like that. There’s some essence that the animal has that it’s trying to solve this problem that a robot just simply sequentially [is] going through a bunch of choices [to reach a decision]. Tell me more about Robots AI...

It’s Robust, R-O-B-U-S-T.

Oh, I am sorry. We’ll make sure we link to it correctly, Robust.AI.

Robust.AI and the name is pointing out both a goal of our company, and it’s a little bit of a dig on the field as a whole. Most solutions in AI right now are not robust. They’re brittle. That’s the opposite of robust. They work on some very narrow set of circumstances, and then you change something. They don’t work anymore.

I mentioned the Atari game system of DeepMind. That’s an example. It works if the paddle is exactly in the place where it’s been in your training data, but it doesn’t work if the paddle moves up a few pixels. At Robust.AI, which we just launched, we’re trying to build solutions for robotics that are more robust.

There’s a culture of demos in robotics right now. You show a robot doing a backflip, and you make 20 videos of it. You show the one case where it really worked. We want to build robots that work 20 out of 20 cases, and that’s going to require a different conception about how you build the software. We are building a fundamentally new (we hope) industrial grade software stack to support all kinds of different robotic activities and make them more robust.

That is ambitious. Where are you at in that endeavor? You just started the company. What’s your timeline? What are you going to try to build and when and all of that?

We hope that we will first be showing a select group of people what we’ve got in mind let’s say a year from now. We have five founders. We just started hiring people. I’m imagining this is going to air a little bit after we have the conversation, but we are already doing well in hiring. We raised a very large seed round, so we have plenty of money to get started. We’ve set up offices in California, and we’re excited to go and full of ideas. Rod is writing first drafts of code, and it’s exciting.

All right, well, I see we’re coming up on time here. You’re clearly a fascinating guy. Where do people go to keep up with what your mind is up to?

I suppose if I were a better person I would update garymarcus.com more regularly. I should update it very soon, and then two websites soon to be available will be Rebooting.AI, which will be tied to the book that we mentioned, and Robust.AI, which is where the company can be found. And @garymarcus on Twitter.

All right, well, thank you for being on the show. Come back anytime you want, and we’ll pick the conversation up.