Episode 12: A Conversation with Scott Clark

In this episode, Byron and Scott talk about algorithms, transfer learning, human intelligence, and pain and suffering.


Guest

Scott is co-founder and CEO of SigOpt, a YC and a16z backed “Optimization as a Service” startup in San Francisco that helps firms tune their ML and AI pipelines. Scott has been applying optimal learning techniques in industry and academia for years, from bioinformatics to production advertising systems. Before SigOpt, Scott worked on the Ad Targeting team at Yelp leading the charge on academic research and outreach with projects like the Yelp Dataset Challenge and open sourcing MOE. Scott holds a PhD in Applied Mathematics and an MS in Computer Science from Cornell University and BS degrees in Mathematics, Physics, and Computational Physics from Oregon State University. Scott was chosen as one of Forbes’ 30 under 30 in 2016.

Transcript

Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today our guest is Scott Clark. He is the CEO and co-founder of SigOpt. They’re a SaaS startup for tuning complex systems and machine learning models. Before that, Scott worked on the ad targeting team at Yelp, leading the charge on academic research and outreach. He holds a PhD in Applied Mathematics and an MS in Computer Science from Cornell, and a BS in Mathematics, Physics, and Computational Physics from Oregon State University. He was chosen as one of Forbes 30 under 30 in 2016. Welcome to the show, Scott.

Scott Clark: Thanks for having me.

I’d like to start with the question, because I know two people never answer it the same: What is artificial intelligence?

I like to go back to an old quote… I don’t remember the attribution for it, but I think it actually fits the definition pretty well. Artificial intelligence is what machines can’t currently do. It’s the idea that there’s this moving goalpost for what artificial intelligence actually means. Ten years ago, artificial intelligence meant being able to classify images; like, can a machine look at a picture and tell you what’s in the picture?

Now we can do that pretty well. Maybe twenty, thirty years ago, if you told somebody that there would be a browser where you can type in words, and it would automatically correct your spelling and grammar and understand language, he would think that’s artificial intelligence. And I think there’s been a slight shift, somewhat recently, where people are calling deep learning artificial intelligence and things like that.

It’s got a little bit conflated with specific tools. So now people talk about artificial general intelligence as this impossible next thing. But I think a lot of people, in their minds, think of artificial intelligence as whatever it is that’s next that computers haven’t figured out how to do yet, that humans can do. But, as computers continually make progress on those fronts, the goalposts continually change.

I’d say today, people think of it as conversational systems, basic tasks that humans can do in five seconds or less, and then artificial general intelligence is everything after that. And things like spell check, or being able to do anomaly detection, are just taken for granted and that’s just machine learning now.

I’ll accept all of that, but that’s more of a sociological observation about how we think of it, and then actually… I’ll change the question. What is intelligence?

That’s a much more difficult question. Maybe the ability to reason about your environment and draw conclusions from it.

Do you think that what we’re building, our systems, are artificial in the sense that we just built them, but they really can do that? Or are they artificial in the sense that they can’t really do that, but they sure can fake it well?

I think they’re artificial in the sense that they’re not biological systems. They seem to be able to perceive input in the same way that a human can perceive input, and draw conclusions based off of that input. Usually, the reward system in place in an artificial intelligence framework is designed to do a very specific thing, very well.

So is there a cat in this picture or not? As opposed to a human: It’s, “Try to live a fulfilling life.” The objective functions are slightly different, but they are interpreting outside stimuli via some input mechanism, and then trying to apply that towards a specific goal. The goals for artificial intelligence today are extremely short-term, but I think that they are performing them on the same level—or better sometimes—than a human presented with the exact same short-term goal.

The artificial component comes into the fact that they were constructed, non-biologically. But other than that, I think they meet the definition of observing stimuli, reasoning about an environment, and achieving some outcome.

You used the phrase ‘they draw conclusions’. Are you using that colloquially, or does the machine actually conclude? Or does it merely calculate?

It calculates, but then it comes to, I guess, a decision at the end of the day. If it’s a classification system, for example… going back to “Is there a cat in this picture?” It draws the conclusion that “Yes, there was a cat. No, that wasn’t a cat.” It can do that with various levels of certainty in the same way that, potentially, a human would solve the exact same problem. If I showed you a blurry Polaroid picture you might be able to say, “I’m pretty sure there’s a cat in there, but I’m not 100 percent certain.”

And if I showed you a very crisp picture of a kitten, you could say, “Yes, there’s a cat there.” And I think a convolutional neural network is doing the exact same thing: taking in that outside stimulus. Not through an optical nerve, but through the raw encoding of pixels, and then coming to the exact same conclusion.
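Scott’s point that a classifier comes to a conclusion “with various levels of certainty” can be made concrete with a softmax, the standard final step that turns a network’s raw scores into probabilities. A minimal sketch with made-up scores, not a real network:

```python
import math

def softmax(logits):
    """Convert raw network scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical final-layer scores for the classes ["cat", "dog", "fish"].
crisp_kitten = softmax([8.0, 1.0, 0.5])      # confident: "yes, there's a cat"
blurry_polaroid = softmax([1.2, 1.0, 0.9])   # hedged: "pretty sure, not 100%"

print(max(crisp_kitten))     # about 0.998: near-certain
print(max(blurry_polaroid))  # about 0.39: far from certain
```

The crisp picture yields a probability close to 1 for “cat,” while the blurry one spreads its belief across the classes, which is exactly the graded certainty described above.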

You make the really useful distinction between an AGI, which is a general intelligence—something as versatile as a human—and then the kinds of stuff we’re building now, which we call AI—which is doing this reasoning or drawing conclusions.

Is an AGI a linear development from what we have now? In other words, do we have all the pieces, and we just need faster computers, better algorithms, more data, a few nips and tucks, and we’re eventually going to get an AGI? Or is an AGI something very different, that is a whole different ball of wax?

I’m not convinced that, with the current tooling we have today, it’s just like… if we add one more hidden layer to a neural network, all of a sudden it’ll be AGI. That being said, I think this is how science and computer science and progress in general work: techniques are built upon each other, and we make advancements.

It might be a completely new type of algorithm. It might not be a neural network. It might be reinforcement learning. It might not be reinforcement learning. It might be the next thing. It might not be on a CPU or a GPU. Maybe it’s on a quantum computer. If you think of scientific and technological progress as this linear evolution of different techniques and ideas, then I definitely think we are marching towards that as an eventual outcome.

That being said, I don’t think that there’s some magic combinatorial setting of what we have today that will turn into this. I don’t think it’s one more hidden layer. I don’t think it’s a GPU that can do one more teraflop—or something like that—that’s going to push us over the edge. I think it’s going to be things built from the foundation that we have today, but it will continue to be new and novel techniques.

There was an interesting talk at the International Conference on Machine Learning in Sydney last week about AlphaGo, and how they got this massive speed-up when they put in deep learning. They were able to break through this plateau that they had found in terms of playing ability, where they could play at the amateur level.

And then once they started applying deep learning networks, that got them to the professional, and now best-in-the-world level. I think we’re going to continue to see plateaus for some of these current techniques, but then we’ll come up with some new strategy that will blast us through and get to the next plateau. But I think that’s an ever-stratifying process.

To continue in that vein… When they convened at Dartmouth in 1956 and said, “We can solve a big part of AI in the summer, with a small group of people,” the assumption was that general intelligence, like all the other sciences, had a few simple laws.

You had Newton, Maxwell; you had electricity and magnetism, and all these things, and they were just a few simple laws. The idea was that all we need to do is figure out those for intelligence. And Pedro Domingos argues in The Master Algorithm that, from a biological perspective, in a sense, that may be true.

That if you look at the DNA difference between us and an animal that isn’t generally intelligent… the amount of code is just a few megabytes that’s different, which teaches how to make my brain and your brain. It sounded like you were saying, “No, there’s not going to be some silver bullet, it’s going to be a bunch of silver buckshot and we’ll eventually get there.”

But do you hold any hope that maybe it is a simple and elegant thing?

Going back to my original statement about what is AI, I think when Marvin Minsky and everybody sat down at Dartmouth, the goalposts for AI were somewhat different. Because they were attacking it for the first time, some of the things were definitely overambitious. But certain things that they set out to do that summer, they actually accomplished reasonably well.

Things like the Lisp programming language, and things like that, came out of that and were extremely successful. But then, once these goals are accomplished, the next thing comes up. Obviously, in hindsight, it was overambitious to think that they could maybe match a human, but I think if you were to go back to Dartmouth and show them what we have today, and say: “Look, this computer can describe the scene in this picture completely accurately.”

I think that could be indistinguishable from the artificial intelligence that they were seeking, even if today what we want is someone we can have a conversation with. And then once we can have a conversation, the next thing is we want them to be able to plan our lives for us, or whatever it may be, solve world peace.

While I think there are some of the fundamental building blocks that will continue to be used—like, linear algebra and calculus, and things like that, will definitely be a core component of the algorithms that make up whatever does become AGI—I think there is a pretty big jump between that. Even if there’s only a few megabytes’ difference between us and a starfish or something like that, each DNA base is two bits.

If you have millions of differences, that’s four to the power of several million: that’s the state space for DNA. Even though you can store it in a small number of megabytes, there are so many different combinations that it’s not like we’re just going to stumble upon it by editing something that we currently have.
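The gap between how compactly DNA can be stored and how enormous its configuration space is can be checked with quick arithmetic. The one-million figure below is an illustrative round number, not a measured count:

```python
import math

# Each DNA base is one of four letters (A, C, G, T): two bits of storage.
differing_bases = 1_000_000  # hypothetical number of differing bases

storage_bytes = differing_bases * 2 / 8   # two bits per base
digits = differing_bases * math.log10(4)  # number of digits in 4**1_000_000

print(storage_bytes)  # 250000.0 bytes, about a quarter of a megabyte
print(int(digits))    # 602059: the state space has over 600,000 digits
```

So a difference that fits in well under a megabyte still spans a search space of roughly 10^602059 configurations, which is why “editing what we currently have” is not a viable path to finding the right one.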

It could be something very different in that configuration space. And I think those are the algorithmic advancements that will continue to push us to the next plateau, and the next plateau, until eventually we meet and/or surpass the human plateau.

You invoked quantum computers in passing, but putting that aside for a moment… Would you believe, just at a gut level—because nobody knows—that we have enough computing power to build an AGI, we just don’t know how?

Well, in the sense that if the human brain is general intelligence, the computing power in the human brain, while impressive… All of the computers in the world are probably better at performing some simple calculations than the biological gray matter mess that exists in all of our skulls. I think the raw amount of transistors and things like that might be there, if we had the right way to apply them, if they were all applied in the same direction.

That being said… Whether or not that’s enough to make it ubiquitous, or whether or not having all the computers in the world mimic a single human child will be considered artificial general intelligence, or if we’re going to need to apply it to many different situations before we claim victory, I think that’s up for semantic debate.

Do you think about how the brain works, even if [the context] is not biological? Is that how you start a problem: “Well, how do humans do this?” Does that even guide you? Does that even begin the conversation? And I know none of this is a map: Birds fly with wings, and airplanes, all of that. Is there anything to learn from human intelligence that you, in a practical, day-to-day sense, use?

Yeah, definitely. I think it often helps to try to approach a problem from fundamentally different ways. One way to approach that problem is from the purely mathematical, axiomatic way; where we’re trying to build up from first principles, and trying to get to something that has a nice proof or something associated with it.

Another way to try to attack the problem is from a more biological setting. If I had to solve this problem, and I couldn’t assume any of those axioms, then how would I begin to try to build heuristics around it? Sometimes you can go from that back to the proof, but there are many different ways to attack that problem. Obviously, there are a lot of things in computer science, and optimization in general, that are motivated by physical phenomena.

So a neural network, if you squint, looks kind of like a biological neural network. There are things like simulated annealing, which is a global optimization strategy that mimics the way steel is annealed… where the metal finds some local lattice structure that has low energy, and then you pound the steel with a hammer, and that added energy lets it find a better global optimum: a lattice structure that makes for harder steel.

But that’s also an extremely popular algorithm in the scientific literature. So it was arrived at from this auxiliary direction. Or a genetic algorithm, where you’re slowly evolving a population to try to get to a good result. I think there is definitely room for a lot of these algorithms to be inspired by biological or physical phenomena, whether or not they are required to come from that to be proficient. I would have trouble, off the top of my head, coming up with the biological equivalent of a support vector machine or something like that. So there are two different ways to attack it, but both can produce really interesting results.
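The annealing analogy maps directly onto the algorithm: always accept downhill moves, occasionally accept uphill ones while the “temperature” is high, and cool over time so the search settles. A toy sketch, not SigOpt’s implementation:

```python
import math
import random

def simulated_annealing(f, x0, step=0.5, t0=10.0, cooling=0.99, iters=5000):
    """Minimize f, accepting worse moves with probability exp(-delta / T)."""
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    t = t0
    for _ in range(iters):
        cand = x + random.uniform(-step, step)  # a random nearby candidate
        fc = f(cand)
        delta = fc - fx
        # Downhill moves are always accepted; uphill moves (the "hammer
        # blows" that add energy) are accepted more often while T is high.
        if delta < 0 or random.random() < math.exp(-delta / t):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        t *= cooling  # gradually cool so the search settles into a minimum
    return best_x, best_fx

# A bumpy function with many local minima; its global minimum is at x = 0.
bumpy = lambda x: x * x + 10 * math.sin(x) ** 2

random.seed(0)
x, fx = simulated_annealing(bumpy, x0=8.0)  # starts far from the optimum
```

A pure hill-climber started at x = 8 would get stuck in one of the sine ripples; the high early temperature lets this search hop over those barriers before cooling pins it down.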

Let’s take a normal thing that a human does, which is: You show a human training data of the Maltese Falcon, the little statue from the movie, and then you show him a bunch of photos. And a human can instantly say, “There’s the falcon under water, and there it’s half-hidden by a tree, and there it’s upside down…” A human does that naturally. So it’s some kind of transfer learning. How do we do that?

Transfer learning is the way that that happens. You’ve seen trees before. You’ve seen water. You’ve seen how objects look inside and outside of water before. And then you’re able to apply that knowledge to this new context.

It might be difficult for a human who grew up in a sensory deprivation chamber to look at this object… and then you start to show them things that they’ve never seen before: “Here’s this object and a tree,” and they might not ‘see the forest for the trees’ as it were.

In addition to that, without any context whatsoever, you take someone who was raised in a sensory deprivation chamber, and you start showing them pictures and asking them to do classification-type tasks. They may be completely unaware of what the reward function here even is. Who is this thing, telling me for the first time to do things I’ve never seen before?

What does it mean to even classify things or describe an object? Because you’ve never seen an object before.

And when you start training these systems from scratch, with no previous knowledge, that’s how they work. They need to slowly learn what’s good, what’s bad. There’s a reward function associated with that.

But with no context, with no previous information, it’s actually very surprising how well they are able to perform these tasks; considering [that when] a child is born, four hours later it isn’t able to do this. A machine algorithm that’s trained from scratch over the course of four hours on a couple of GPUs is able to do this.

You mentioned the sensory deprivation chamber a couple of times. Do you have a sense that we’re going to need to embody these AIs to allow them to—and I use the word very loosely—‘experience’ the world? Are they locked in a sensory deprivation chamber right now, and that’s limiting them?

I think with transfer learning, and pre-training of data, and some reinforcement algorithm work, there’s definitely this idea of trying to make that better, and bootstrapping based off of previous knowledge in the same way that a human would attack this problem. I think it is a limitation. It would be very difficult to go from zero to artificial general intelligence without providing more of this context.

There’s been many papers recently, and OpenAI had this great blog post recently where, if you teach the machine language first, if you show it a bunch of contextual information—this idea of this unsupervised learning component of it, where it’s just absorbing information about the potential inputs it can get—that allows it to perform much better on a specific task, in the same way that a baby absorbs language for a long time before it actually starts to produce it itself.
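The pretraining idea here (absorb general structure first, then adapt to a specific task) can be sketched in miniature: freeze a feature extractor that stands in for the pretrained layers, and train only a new final layer on the small task. Everything below, the features, the data, and the update rule, is a hypothetical toy:

```python
# Stand-in for pretrained layers: learned elsewhere, now frozen and reused.
def frozen_features(x):
    return [x[0] + x[1], x[0] * x[1], 1.0]  # hand-fixed features plus a bias

def train_final_layer(data, epochs=50, lr=0.1):
    """Fine-tune only the new final layer with perceptron-style updates."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, label in data:
            feats = frozen_features(x)
            pred = 1 if sum(wi * fi for wi, fi in zip(w, feats)) > 0 else 0
            err = label - pred  # 0 when correct, so no update is made
            w = [wi + lr * err * fi for wi, fi in zip(w, feats)]
    return w

# Tiny task-specific dataset: label is 1 when the inputs sum to more than 1.
data = [((0.9, 0.8), 1), ((0.1, 0.2), 0), ((0.7, 0.6), 1), ((0.2, 0.1), 0)]
w = train_final_layer(data)

def predict(x):
    return 1 if sum(wi * fi for wi, fi in zip(w, frozen_features(x))) > 0 else 0
```

Because the frozen features already encode useful structure, the final layer needs only a handful of examples and epochs, which is the practical payoff of pretraining that the paragraph above describes.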

And it could be in a very unstructured way, but it’s able to learn some of the actual language structure or sounds from the particular culture in which it was raised in this unstructured way.

Let’s talk a minute about human intelligence. Why do you think we understand so poorly how the brain works?

That’s a great question. Scientifically—and this is my background in math and physics speaking—it seems like it’s easier to break down modular, decomposable systems. Humanity has done a very good job at understanding, at least at a high level, how physical systems work, or things like chemistry.

Biology starts to get a little bit messier, because it’s less modular and less decomposable. And as you start to build larger and larger biological systems, it becomes a lot harder to understand all the different moving pieces. Then you go to the brain, and then you start to look at psychology and sociology, and all of the lines get much fuzzier.

It’s very difficult to build an axiomatic rule system. And humans aren’t even able to do that in some sort of grand unified way with physics, or understand quantum mechanics, or things like that; let alone being able to do it for these sometimes infinitely more complex systems.

Right. But the most successful animal on the planet is a nematode worm. By some estimates, four out of five individual animals are nematode worms. They’re successful, they find food, and they reproduce and they move. Their brains have 302 neurons. We’ve spent twenty years trying to model that, a bunch of very smart people in the OpenWorm project…

But twenty years trying to model 300 neurons, to just reproduce this worm, make a digital version of it; and even to this day, people in the project say it may not be possible.

I guess the argument is, 300 sounds like a small number. One thing that’s very difficult for humans to internalize is the exponential function. If intelligence grew linearly, then yeah: if we could understand one neuron, then 300 might not be that much. But the state space, or the complexity, grows exponentially. If there are ten different positions for every single one of those neurons, that’s 10^300, which is more than the number of atoms in the universe.
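The back-of-the-envelope numbers here are easy to verify: ten possible states for each of 300 neurons gives 10^300 joint configurations, while a common rough estimate puts the observable universe at around 10^80 atoms:

```python
# Ten states for each of 300 neurons, enumerated jointly: 10**300.
state_space = 10 ** 300
atoms_in_universe = 10 ** 80  # common rough estimate

print(state_space > atoms_in_universe)  # True, by 220 orders of magnitude
print(len(str(state_space)) - 1)        # 300
```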

Right. But we aren’t starting by just rolling 300 dice and hoping for them all to be—we know how those neurons are arranged.

At a very high level we do.

I’m getting at a point: maybe we don’t even understand how a neuron works. A neuron may be doing stuff down at the quantum level. A single neuron may be a gigantic supercomputer we don’t even have a hope of understanding.

From a chemical standpoint, we can have an understanding of, “Okay, so we have neurotransmitters that carry a positive charge, that then cause a reaction based off of some threshold of charge, and there’s this catalyst that happens.” I think from a physics and chemistry understanding, we can understand the base components of it, but as you start to build these complex systems that have this combinatorial set of states, it does become much more difficult.

And I think that’s that abstraction, where we can understand how simple chemical reactions work. But then it becomes much more difficult once you start adding more and more. Or even in physics… like if you have two bodies, and you’re trying to calculate the gravity, that’s relatively easy. Three? Harder. Four? Maybe impossible. It becomes much harder to solve these higher-order, higher-body problems. And even with 302 neurons, that starts to get pretty complex.

Oddly, two of them aren’t connected to anything, just like floating out there…

Do you think human intelligence is emergent?

In what respect?

I will clarify that. There are two sorts of emergence: one is weak, and one is strong. Weak emergence is where a system takes on characteristics which don’t, at first glance, appear to be derivable from its parts. So the intelligence displayed by an ant colony, or a beehive: the way that some bees can shimmer in unison to scare off predators. No single bee is saying, “We need to do this.”

The anthill behaves intelligently, even though… The queen isn’t, like, in charge; the queen is just another ant, but somehow it all adds intelligence. So that would be something where it takes on these attributes.

Can you really intuitively derive intelligence from neurons?

And then, to push that a step further, there are some who believe in something called ‘strong emergence’, where the characteristics literally are not derivable. You cannot look at a bunch of matter and explain how it can become conscious, for instance. It is what a minority of people believe about emergence: that there is some additional property of the universe we do not understand that makes these things happen.

The question I’m asking you is: Is reductionism the way to go to figure out intelligence? Is that how we’re going to kind of make advances towards an AGI? Just break it down into enough small pieces.

I think that is an approach; whether or not it’s ‘the’ ultimate approach that works remains to be seen. As I was mentioning before, there are ways to take biological or physical systems, and then try to work them back into something that can be used and applied in a different context. There are other ways, where you start from the more theoretical or axiomatic side, and try to move forward into something that can be applied to a specific problem.

I think there’s wide swaths of the universe that we don’t understand at many levels. Mathematics isn’t solved. Physics isn’t solved. Chemistry isn’t solved. All of these build on each other to get to these large, complex, biological systems. It may be a very long time, or we might need an AGI to help us solve some of these systems.

I don’t think it’s required to understand everything to be able to observe intelligence—like, proof by example. I can’t tell you why my brain thinks, but my brain is thinking, if you can assume that humans are thinking. So you don’t necessarily need to understand all of it to put it all together.

Let me ask you one more far-out question, and then we’ll go to a little more immediate future. Do you have an opinion on how consciousness comes about? And if you do or don’t, do you believe we’re going to build conscious machines?

Even to throw a little more into that one, do you think consciousness—that ability to change focus and all of that—is a requisite for general intelligence?

So, I would like to hear your definition of consciousness.

I would define it by example, to say that it’s subjective experience. It’s how you experience things. We’ve all had that experience when you’re driving, that you kind of space out, and then, all of a sudden, you kind of snap to. “Whoa! I don’t even remember getting here.”

And so that time when you were driving, your brain was elsewhere, you were clearly intelligent, because you were merging in and out of traffic. But in the sense I’m using the word, you were not ‘conscious’, you were not experiencing the world. If your foot caught on fire, you would feel it; but you weren’t experiencing the world. And then instantly, it all came on and you were an entity that experienced something.

Or, put another way… this is often illustrated with Frank Jackson’s thought experiment about Mary:

He posits somebody named Mary, who knows everything about color, at a god-like level; she knows every single thing about color. But the catch is, as you might guess, she’s never seen it. She’s lived in a black-and-white room and never seen color. And one day, she opens the door, she looks outside, and she sees red.

The question becomes: Does she learn anything? Did she learn something new?  

In other words, is experiencing something different than knowing something? Those two things taken together, defining consciousness, is having an experience of the world…

I’ll give one final one. You can hook a sensor up to a computer, and you can program the computer to play an mp3 of somebody screaming if the sensor hits 500 degrees. But nobody would say, at this day and age, the computer feels the pain. Could a computer feel anything?

Okay. I think there’s a lot to unpack there. I think computers can perceive the environment. Your webcam is able to record the environment in the same way that your optical nerves are able to record the environment. When you’re driving a car, and daydreaming, and kind of going on autopilot, as it were, there still are processes running in the background.

If you were to close your eyes, you would be much worse at doing lane merging and things like that. And that’s because you’re still getting the sensory input, even if you’re not actively, consciously aware of the fact that you’re observing that input.

Maybe that’s what you’re getting at with consciousness here: not only the actual task that’s being performed, which I think computers are very good at—and we have self-driving cars out on the street in the Bay Area every day—but the awareness of the fact that you are performing this task. A kind of meta-level: “I’m assembling together all of these different subcomponents.”

Whether that’s driving a car, thinking about the meeting that I’m running late to, some fight that I had with my significant other the night before, or whatever it is. There’s all these individual processes running, and there could be this kind of global awareness of all of these different tasks.

I think today, where artificial intelligence sits is, performing each one of these individual tasks extremely well, toward some kind of objective function of, “I need to not crash this car. I need to figure out how to resolve this conflict,” or whatever it may be; or, “Play this game in an artificial intelligence setting.” But we don’t yet have that kind of governing overall strategy that’s aware of making these tradeoffs, and then making those tradeoffs in an intelligent way. But that overall strategy itself is just going to be going toward some specific reward function.

Probably when you’re out driving your car, and you’re spacing out, your overall reward function is, “I want to be happy and healthy. I want to live a meaningful life,” or something like that. It can be something nebulous, but you’re also just this collection of subroutines that are driving towards this specific end result.

But the direct question of what would it mean for a computer to feel pain? Will a computer feel pain? Now they can sense things, but nobody argues they have a self that experiences the pain. It matters, doesn’t it?

It depends on what you mean by pain. If you mean there’s a response of your nervous system to some outside stimuli that you perceive as pain, a negative response, and—

—It involves emotional distress. People know what pain is. It hurts. Can a computer ever hurt?

It’s a fundamentally negative response to what you’re trying to achieve. So pain and suffering are the opposite of happiness. And your objective function as a human is happiness, let’s say. So, by failing to achieve that objective, you feel something like pain. Evolutionarily, we might have evolved this in order to avoid specific things. Like, you get pain when you touch a flame, so don’t touch the flame.

And the reason behind that is biological systems degrade in high-temperature environments, and you’re not going to be able to reproduce or something like that.

You could argue that when a classification system fails to classify something, and it gets penalized in its reward function, that’s the equivalent of it finding something where, in its state of the world, it has failed to achieve its goal, and it’s getting the opposite of what its purpose is. And that’s similar to pain and suffering in some way.
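The analogy being drawn here, a penalty in the reward function as the machine’s version of a negative outcome, reduces to a scoring rule like the following. The predictions are made up for illustration:

```python
def reward(prediction, truth):
    """+1 when the system achieves its goal, -1 (the penalty) when it fails."""
    return 1 if prediction == truth else -1

# A hypothetical cat classifier scored against ground-truth labels.
predictions = ["cat", "cat", "dog", "cat"]
truths = ["cat", "dog", "dog", "cat"]

total = sum(reward(p, t) for p, t in zip(predictions, truths))
print(total)  # 2: three correct (+3) and one miss (-1)
```

Whether that penalty amounts to anything like suffering is exactly what the exchange that follows debates; the arithmetic above is all the machine registers.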

But is it? Let’s be candid. You can’t take a person and torture them, because that’s a terrible thing to do… because they experience pain. [Whereas if] you write a program that has an infinite loop that causes your computer to crash, nobody’s going to suggest you should go to jail for that. Because people know that those are two very different things.

It is a negative neurological response based off of outside stimuli. A computer can have a negative response, and perform based off of outside stimuli poorly, relative to what it’s trying to achieve… Although I would definitely agree with you that that’s not a computer experiencing pain.

But from a pure chemical level, down to the algorithmic component of it, they’re not as fundamentally different… it’s not that because it’s a human, there’s something magic about it being a human. A dog can also experience pain.

These worms—I’m not as familiar with the literature on that, but [they] could potentially experience pain. And as you derive that further and further back, you might have to bend your definition of pain. Maybe they’re not feeling something in a central nervous system, like a human or a dog would, but they’re perceiving something that’s negative to what they’re trying to achieve with this utility function.

But we do draw a line. And I don’t know that I would use the word ‘magic’ the way you’re doing it. We draw this line by saying that dogs feel pain, so we outlaw animal cruelty. Bacteria don’t, so we don’t outlaw antibiotics. There is a material difference between those two things.

So if the difference is a central nervous system, and pain is being defined as a nervous response to some outside stimuli… then unless we explicitly design machines to have central nervous systems, then I don’t think they will ever experience pain.

Thanks for indulging me in all of that, because I think it matters… Because up until thirty years ago, veterinarians typically didn’t use anesthetic. They were told that animals couldn’t feel pain. Babies were operated on in the ‘90s—open heart surgery—under the theory they couldn’t feel pain.  

What really intrigues me is the idea of how would we know if a machine did? That’s what I’m trying to deconstruct. But enough of that. We’ll talk about jobs here in a minute, and those concerns…

There are groups of people that are legitimately afraid of AI. You know all the names. You’ve got Elon Musk, you’ve got Stephen Hawking. Bill Gates has thrown in his hat with that group, Wozniak has. Nick Bostrom wrote a book that addressed the existential threat and all of that. Then you have Mark Zuckerberg, who says no, no, no. You’ve got Oren Etzioni over at the Allen Institute, just working on some very basic problems. You’ve got Andrew Ng with his “overpopulation on Mars,” saying it’s not helpful to even have this conversation.

What is different about those two groups in your mind? What is the difference in how they view the world that gives them these incredibly different viewpoints?

I think it goes down to a definition problem. As you mentioned at the beginning of this podcast, when you ask people, “What is artificial intelligence?” everybody gives you a different answer. I think each one of these experts would also give you a different answer.

If you define artificial intelligence as matrix multiplication and gradient descent in a deep learning system, trying to achieve a very specific classification output given some pixel input—or something like that—it’s very difficult to conceive that as some sort of existential threat for humanity.

But if you define artificial intelligence as this general intelligence, this kind of emergent singularity where the machines don’t hit a plateau, where they continue to advance well beyond humans… maybe to the point where they don’t need humans, or we become the ants in that system… that very rapidly becomes an existential threat.

As I said before, I don’t think there’s an incremental improvement from algorithms—as they exist in the academic literature today—to that singularity, but I think it can be a slippery slope. And I think that’s what a lot of these experts are talking about… Where if it does become this dynamic system that feeds on itself, by the time we realize it’s happening, it’ll be too late.

Whether or not that’s because of the algorithms that we have today, or algorithms down the line, it does make sense to start having conversations about that, just because of the time scales over which governments and policies tend to work. But I don’t think someone is going to design a TensorFlow or MXNet algorithm tomorrow that’s going to take over the world.

There’s legislation in Europe to basically say, if an AI makes a decision about whether you should get an auto loan or something, you deserve to know why it turned you down. Is that a legitimate request, or is it like you go to somebody at Google and say, “Why is this site ranked number one and this site ranked number two?” There’s no way to know at this point.  

Or is that something that, with the auto loan thing, you’re like, “Nope, here are the big bullet points of what went into it.” And if that becomes the norm, does that slow down AI in any way?

I think it’s important to make sure, just from a societal standpoint, that we continue to strive towards not being discriminatory towards specific groups and people. It can be very difficult, when you have something that looks like a black box from the outside, to be able to say, “Okay, was this being fair?” based off of the fairness that we as a society have agreed upon.

The machine doesn’t have that context. The machine doesn’t have the policy, necessarily, inside to make sure that it’s being as fair as possible. We need to make sure that we do put these constraints on these systems, so that it meets what we’ve agreed upon as a society, in laws, etc., to adhere to. And that it should be held to the same standard as if there was a human making that same decision.

There is, of course, a lot of legitimate fear wrapped up in the effect of automation and artificial intelligence on employment. And just to set the problem up for the listeners: there are broadly three camps, and everybody intuitively knows them.

There’s one group that says, “We’re going to advance our technology to the point that there will be a group of people who do not have the educational skills needed to compete with the machines, and we’ll have a permanent underclass of people who are unemployable.” It would be as if the Great Depression never went away.

And then there are people who say, “Oh, no, no, no. You don’t understand. Everything, every job, a machine is going to be able to do.” You’ll reach a point where the machine will learn it faster than the human, and that’s it.

And then you’ve got a third group that says, “No, that’s all ridiculous. We’ve had technology come along as transformative as this before. We’ve had electricity, and machines replacing animals, and we’ve always maintained full employment.” People just learn how to use these tools to increase their own productivity, we maintain full employment, and wages grow.

So, which of those, or a fourth one, do you identify with?

This might be an unsatisfying answer, but I think we’re going to go through all three phases. I think we’re in the third camp right now, where people are learning new systems, and it’s happening at a pace where people can go to a computer science boot camp and become an engineer, and try to retrain and learn some of these systems, and adapt to this changing scenario.

I think, very rapidly—especially at the exponential pace that technology tends to evolve—it does become very difficult. Fifty years ago, if you wanted to take apart your telephone to figure out how it worked and repair it, that was something a kid could do at an entry-level circuits camp. That’s impossible to do with an iPhone.

I think that’s going to continue to happen with some of these more advanced systems, and you’re going to need to spend your entire life understanding some subcomponent of it. And then, in the further future, as we move towards this direction of artificial general intelligence… Like, once a machine is a thousand times, ten thousand times, one hundred thousand times smarter—by whatever definition—than a human, and that increases at an exponential pace… humans won’t be needed for a lot of different things.

Whether or not that’s a fundamentally bad thing is up for debate. I think one thing that’s different about this than the Industrial Revolution, or the agricultural revolution, or other shifts that have happened throughout human history, is that instead of happening over the course of generations or decades… Maybe your father, your grandfather, and your entire family tree did a specific job, but that job doesn’t exist anymore, so you train yourself to do something different.

Once it starts to happen over the course of a decade, or a year, or a month, it becomes much harder to completely retrain. That being said, there’s lots of thoughts about whether or not humans need to be working to be happy. And whether or not there could be some other fundamental thing that would increase the net happiness and fulfillment of people in the world, besides sitting at a desk for forty hours a week.

And maybe that’s actually a good thing, if we can set up the societal constructs to allow people to do that in a healthy and happy way.

Do you have any thoughts on computers displaying emotions, emulating emotions? Is that going to be a space where people are going to want authentic human experiences in those in the future? Or are we like, “No, look at how people talk to their dog,” or something? If it’s good enough to fool you, you just go along with the conceit?

The great thing about computers, and artificial intelligence systems, and things like that is if you point them towards a specific target, they’ll get pretty good at hitting that target. So if the goal is to mimic human emotion, I think that that’s something that’s achievable. Whether or not a human cares, or is even able to distinguish between that and actual human emotion, could be very difficult.

When I was doing my PhD at Cornell, I learned about this psychology chatbot called ELIZA—it was actually built at MIT back in the ‘60s. It followed a specific school of psychotherapy (Rogerian, person-centered therapy), replied in specific ways, and people found it incredibly helpful.

Even if they knew that it was just a machine responding to them, it was a way for them to get out their emotions and work through specific problems. As these machines get more sophisticated and able, as long as it’s providing utility to the end user, does it matter who’s behind the screen?

That’s a big question. Weizenbaum shut down ELIZA because he said that when a machine says, “I understand” that it’s a lie, there’s no ‘I’, and there’s nothing [there] that understands anything. He had real issues with that.

But then when they shut it down, some of the end users were upset, because they were still getting quite a bit of utility out of it. There’s this moral question of whether or not you can take away something from someone who is deriving benefit from it as well.

So I guess the concern is that maybe we reach a day where an AI best friend is better than a real one. An AI one doesn’t stand you up. And an AI spouse is better than a human spouse, because of all of those reasons. Is that a better world, or is it not?

I think it becomes a much more dangerous world, because as you said before, someone could decide to turn off the machine. When it’s someone taking away your psychologist, that could be very dangerous. When it’s someone deciding that you didn’t pay your monthly fee, so they’re going to turn off your spouse, that could be quite a bit worse as well.

As you mentioned before, people don’t necessarily associate the feelings or pain or anything like that with the machine, but as these get more and more life-like, and as they are designed with the reward function of becoming more and more human-like, I think that distinction is going to become quite a bit harder for us to understand.

And it not only affects the machine—which you can make the argument doesn’t have a voice—but it’ll start to affect the people as well.

One more question along these lines. You were a Forbes 30 Under 30. You’re fine with computer emotions, and you have this set of views. Do you notice any generational difference between researchers who have been in it longer than you, and people of your age and training? Do you look at it, as a whole, differently than another generation might have?

I think there are always going to be generational differences. People grow up in different times and contexts, societal norms shift… I would argue usually for the better, but not always. So I think that that context in which you were raised, that initial training data that you apply your transfer learning to for the rest of your life, has a huge effect on what you’re actually going to do, and how you perceive the world moving forward.

I want to spend a good amount of time today on SigOpt. Can you tell me what you’re trying to do there, why you co-founded it, and what the mission is? Give me the whole story.

Yeah, definitely. SigOpt is an optimization-as-a-service company, or a software-as-a-service offering. What we do is help people configure these complex systems. So when you’re building a neural network—or maybe it’s a reinforcement learning system, or an algorithmic trading strategy—there’s often many different tunable configuration parameters.

These are the settings that you need to put in place before the system itself starts to do any sort of learning: things like the depth of the neural network, the learning rates, some of these stochastic gradient descent parameters, etc.

These are often nuisance parameters that get brushed under the rug. They’re typically set via relatively simplistic methods, like brute-forcing a grid or trying random configurations. What we do is take the state-of-the-art research from academia in Bayesian and global optimization, and ensemble those algorithms behind a simple API.

So when you are downloading MXNet, or TensorFlow, or Caffe2, whatever it is, you don’t have to waste a bunch of time trying different things via trial-and-error. We can guide you to the best solution quite a bit faster.
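The kind of configuration search being described can be sketched in a few lines. This is not SigOpt’s actual API; it is a toy random-search baseline (one of the “relatively simplistic methods” Scott mentions) over two hypothetical parameters, a learning rate and a network depth, with a synthetic function standing in for the expensive train-and-validate step:

```python
import math
import random

# Hypothetical stand-in for an expensive train-and-validate cycle: it maps a
# configuration (learning rate, network depth) to a validation accuracy, with
# a known optimum near lr = 0.01, depth = 6.
def validation_accuracy(lr, depth):
    return math.exp(-(math.log10(lr) + 2) ** 2) * math.exp(-((depth - 6) ** 2) / 8)

# Random-search baseline: sample configurations blindly and keep the best one.
def random_search(trials=200, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        lr = 10 ** rng.uniform(-5, 0)   # learning rate, sampled log-uniformly
        depth = rng.randint(1, 12)      # network depth, sampled uniformly
        score = validation_accuracy(lr, depth)
        if best is None or score > best[0]:
            best = (score, lr, depth)
    return best

score, lr, depth = random_search()
print(f"best accuracy {score:.3f} at lr={lr:.4g}, depth={depth}")
```

Bayesian optimization improves on this baseline by fitting a probabilistic model to the (configuration, score) pairs observed so far and choosing each next trial to balance exploring uncertain regions against exploiting promising ones, which typically reaches a good configuration in far fewer expensive evaluations.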

Do you have any success stories that you like to talk about?

Yeah, definitely. One of our customers is Hotwire. They’re using us to do things like ranking systems. We work with a variety of different algorithmic trading firms to make their strategies more efficient. We also have this great academic program where SigOpt is free for any academic at any university or national lab anywhere in the world.

So we’re helping accelerate the flywheel of science by allowing people to spend less time doing trial-and-error. I wasted way too much of my PhD on this, to be completely honest—fine-tuning different configuration settings in bioinformatics algorithms.

So our goal is… If we can have humans do what they’re really good at, which is creativity—understanding the context in the domain of a problem—and then we can make the trial-and-error component as little as possible, hopefully, everything happens a little bit faster and a little bit better and more efficiently.

What are the big challenges you’re facing?

Where this system makes the biggest difference is in large, complex systems, where it’s very difficult to manually tune or brute-force the problem. Humans tend to be pretty bad at doing 20-dimensional optimization in their heads. But a surprising number of people still take that approach, because they’re unable to access some of the incredible research that’s been going on in academia for the last several decades.

Our goal is to make that as easy as possible. One of our challenges is finding people with these interesting complex problems. I think the recent surge of interest in deep learning and reinforcement learning, and the complexity that’s being imbued in a lot of these systems, is extremely good for us, and we’re able to ride that wave and help these people realize the potential of these systems quite a bit faster than they would otherwise.

Having the market come to us is something that we’re really excited about, but it’s not instant.

Do you find that people come to you and say, “Hey, we have this dataset, and we think somewhere in here we can figure out whatever”? Or do they just say, “We have this data, what can we do with it?” Or do they come to you and say, “We’ve heard about this AI thing, and want to know what we can do”?

There are companies that help solve that particular problem, where they’re given raw data and they help you build a model and apply it to some business context. Where SigOpt sits, which is slightly different than that, is when people come to us, they have something in place. They already have data scientists or machine learning engineers.

They’ve already applied their domain expertise to really understand their customers, the business problem they’re trying to solve, everything like that. And what they’re looking for is to get the most out of these systems that they’ve built. Or they want to build a more advanced system as rapidly as possible.

And so SigOpt bolts on top of these pre-existing systems, and gives them that boost by fine-tuning all of these different configuration parameters to get to their maximal performance. So, sometimes we do meet people like that, and we pass them on to some of our great partners. When someone has a problem and they just want to get the most out of it, that’s where we can come in and provide this black box optimization on top of it.

Final question-and-a-half. Do you speak a lot? Do you tweet? If people want to follow you and keep up with what you’re doing, what’s the best way to do that?

They can follow @SigOpt on Twitter. We have a blog where we post technical and high-level posts about optimization and some of the different advancements in deep learning and reinforcement learning. We publish papers as well, but blog.sigopt.com and @SigOpt on Twitter are the best ways to follow along.

Alright. It has been an incredibly fascinating hour, and I want to thank you for taking the time.

Excellent. Thank you for having me. I’m really honored to be on the show.

Byron explores issues around artificial intelligence and conscious computers in his upcoming book The Fourth Age, to be published in April by Atria, an imprint of Simon & Schuster. Pre-order a copy here