In this episode, Byron and Tasha talk about speech recognition, AGI, consciousness, Droice Lab, healthcare, and science fiction.
Tasha Nagamine is finishing her Ph.D. in Electrical Engineering at Columbia University, where she works on understanding deep learning methods. She also works with Microsoft Research on applying cutting edge artificial intelligence methods to speech research. In the past she has worked with several MDs in hospitals, which makes her knowledgeable about the impact of technology in healthcare.
Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today our guest is Tasha Nagamine. She’s a PhD student at Columbia University, she holds an undergraduate degree from Brown and a Masters in Electrical Engineering from Columbia. Her research is in neural net processing in speech and language, then the potential applications of speech processing systems through, here’s the interesting part, biologically-inspired, deep neural network models. As if that weren’t enough to fill up a day, Tasha is also the CTO of Droice Labs, an AI healthcare company, which I’m sure we will chat about in a few minutes. Welcome to the show, Tasha.
Tasha Nagamine: Hi.
So, your specialty, it looks like, coming all the way up, is electrical engineering. How do you now find yourself in something which is often regarded as a computer science discipline, which is artificial intelligence and speech recognition?
Yeah, so it’s actually a bit of an interesting meandering journey, how I got here. My undergrad specialty was actually in physics, and when I decided to go to grad school, I was very interested, you know, I took a class and found myself very interested in neuroscience.
So, when I joined Columbia, the reason I’m actually in the electrical engineering department is that my advisor is an EE, but what my research and what my lab focuses on is really in neuroscience and computational neuroscience, as well as neural networks and machine learning. So, in that way, I think what we do is very cross-disciplinary, so that’s why the exact department, I guess, may be a bit misleading.
One of my best friends in college was a EE, and he said that every time he went over to like his grandmother’s house, she would try to get him to fix like the ceiling fan or something. Have you ever had anybody assume you’re proficient with a screwdriver as well?
Yes, that actually happens to me quite frequently. I think I had one of my friends’ landlords one time, when I said I was doing electrical engineering, thought that that actually meant electrician, so was asking me if I knew how to fix light bulbs and things like that.
Well, let’s start now talking about your research, if you would. In your introduction, I stressed biologically-inspired deep neural networks. What do you think, do we study the brain and try to do what it does in machines, or are we inspired by it, or do we figure out what the brain’s doing and do something completely different? Like, why do you emphasize “biologically-inspired” DNNs?
That’s actually a good question, and I think the answer to that is that, you know, researchers and people doing machine learning all over the world actually do all of those things. So, the reason that I was stressing a biologically-inspired—well, you could argue that, first of all, all neural networks are in some way biologically-inspired; now, whether or not they are a good biologically-inspired model, is another question altogether—I think a lot of the big, sort of, advancements that come, like a convolutional neural network was modeled basically directly off of the visual system.
That being said, despite the fact that there are a lot of these biological inspirations, or sources of inspiration, for these models, there’s many ways in which these models actually fail to live up to the way that our brains actually work. So, by saying biologically-inspired, I really just mean a different kind of take on a neural network where we try to, basically, find something wrong with a network that, you know, perhaps a human can do a little bit more intelligently, and try to bring this into the artificial neural network.
Specifically, one issue with current neural networks is that, usually, unless you keep training them, they have no way to really change themselves, or adapt to new situations, but that’s not what happens with humans, right? We continuously take inputs, we learn, and we don’t even need supervised labels to do so. So one of the things that I was trying to do was to try to draw from this inspiration, to find a way to kind of learn in an unsupervised way, to improve your performance in a speech recognition task.
So just a minute ago, when you and I were chatting before we started recording, a siren came by where you are, and the interesting thing is, I could still understand everything you were saying, even though that siren was, arguably, as loud as you were. What’s going on there, am I subtracting out the siren? How do I still understand you? I ask this for the obvious reason that computers seem to really struggle with that, right?
Right, yeah. And actually how this works in the brain is a very open question and people don’t really know how it’s done. This is actually an active research area of some of my colleagues, and there’s a lot of different models that people have for how this works. And you know, it could be that there’s some sort of filter in your brain that, basically, sorts speech from the noise, for example, or a relevant signal from an irrelevant one. But how this happens, and exactly where this happens is pretty unknown.
But you’re right, that’s an interesting point you make, is that machines have a lot of trouble with this. And so that’s one of the inspirations behind these types of research. Because, currently, in machine learning, we don’t really know the best way to do this and so we tend to rely on large amounts of data, and large amounts of labeled data or parallel data, data corrupted with noise intentionally, however this is definitely not how our brain is doing it, but how that’s happening, I don’t think anyone really knows.
Let me ask you a different question along the same lines. I read these stories all the time that say that, “AI has approached human-quality in transcribing speech,” so I see that. And then I call my airline of choice, I will not name them, and it says, “What is your frequent flyer number?” You know, it’s got Caller ID, it should know that, but anyway. Mine, unfortunately, has an A, an H, and an 8 in it, so you can just imagine “AH8H888H”, right?
It never gets it. So, I have to get up, turn the fan off in my office, take my headset off, hold the phone out, and say it over and over again. So, two questions: what’s the disconnect between what I read and my daily experience? Actually, I’ll give you that question and then I have my follow up in a moment.
Oh, sure, so you’re saying, are you asking why it can’t recognize your—
But I still read these stories that say it can do as good of a job as a human.
Well, so usually—and, for example, I think, recently, there was a story published about Microsoft coming up with a system that had reached human parity in speech recognition—well, usually when you’re saying that, you have it on a somewhat artificial task. So, you’ll have a predefined data set, and then test the machine against humans, but that doesn’t necessarily correspond to a real-world setting, they’re not really doing speech recognition out in the wild.
And, I think, you have an even more difficult problem, because although it’s only frequent flyer numbers, you know, there’s no language model there, there’s no context for what your next number should be, so it’s very hard for that kind of system to self-correct, which is a bit problematic.
So I’m hearing two things. The first thing, it sounds like you’re saying, they’re all cooking the books, as it were. The story is saying something that I interpret one way that isn’t real, if you dig down deep, it’s different. But the other thing you seem to be saying is, even though there’s only thirty-six things I could be saying, because there’s no natural flow to that language, it can’t say, “oh, the first word he said was ‘the’ and the third word was ‘ran;’ was that middle word ‘boy’ or ‘toy’?” It could say, “Well, toys don’t run, but boys do, therefore it must be, ‘The boy ran.'” Is that what I’m hearing you saying, that a good AI system’s going to look contextually and get clues from the word usage in a way that a frequent flyer system doesn’t.
Right, yeah, exactly. I think this is actually one of the fundamental limitations of, at least, acoustic modeling, or, you know, the acoustic part of speech recognition, which is that you are completely limited by what the person has said. So, you know, maybe it could be that you’re not pronouncing your “t” at the end of “eight,” very emphatically. And the issue is that, there’s nothing you can really do to fix that without some sort of language-based information to fix it.
And then, to answer your first question, I wouldn’t necessarily call it “cooking the books,” but it is a fact that, you know, really the data that you have to train on and test on and to evaluate your metrics on, often, almost never really matches up with real-world data, and this is a huge problem in the speech domain, it’s a very well-known issue.
You take my 8, H, and A example—which you’re saying that’s a really tricky problem without context—and, let’s say, you have one hundred English speakers, but one is from Scotland, and one could be Australian, and one could be from the east coast, one could be from the south of the United States; is it possible that the range of how 8 is said in all those different places is so wide that it overlaps with how H is said in some places. So, in other words, it’s a literally insoluble problem.
It is, I would say it is possible. One of the issues is then you should have a separate model for different dialects. I don’t want to dive too far into the weeds with this, but at the root of a speech recognition system is often things like the fundamental linguistic or phonetic unit is a phoneme, which is the smallest speech sound, and people even argue about whether or not that these actually exist, what they actually mean, whether or not this is a good unit to use when modeling speech.
That being said, there’s a lot of research underway, for example, sequence to sequence models or other types of models that are actually trying to bypass this sort of issue. You know, instead of having all of these separate components modeling all of the acoustics separately, can we go directly from someone’s speech and from there exactly get text. And maybe through this unsupervised approach it’s possible to learn all these different things about dialects, and to try to inherently learn these things, but that is still a very open question, and currently those systems are not quite tractable yet.
I’m only going to ask one more question on these lines—though I could geek out on this stuff all day long, because I think about it a lot—but really quickly, do you think you’re at the very beginning of this field, or do you feel it’s a pretty advanced field? Just the speech recognition part.
Speech recognition, I think we’re nearing the end of speech recognition to be honest. I think that you could say that speech is fundamentally limited; you are limited by the signal that you are provided, and your job is to transcribe that.
Now, where speech recognition stops, that’s where natural language processing begins. As everyone knows, language is infinite, you can do anything with it, any permutation of words, sequences of words. So, I really think that natural language processing is the future of this field, and I know that a lot of people in speech are starting to try to incorporate more advanced language models into their research.
Yeah, that’s a really interesting question. So, I ran an article on Gigaom, where I had an Amazon Alexa device on my desk and I had a Google Assistant on my desk, and what I noticed right away is that they answer questions differently. These were factual questions, like “How many minutes are in a year?” and “Who designed the American flag?” They had different answers. And you can say it’s because of an ambiguity in the language, but if this is an ambiguity, then all language is naturally ambiguous.
So, the minutes in a year answer difference was that one gave you the minutes in 365.24 days, a solar year, and one gave you the minutes in a calendar year. And with regard to the flag, one said Betsy Ross, and one said the person who designed the fifty-star configuration on the current flag.
And so, we’re a long way away from the machines saying, “Well, wait a second, do you mean the current flag or the original flag?” or, “Are you talking about a solar year or a calendar year?” I mean, we’re really far away from that, aren’t we?
Yeah, I think that’s definitely true. You know, people really don’t understand how even humans process language, how we disambiguate different phrases, how we find out what are the relevant questions to ask to disambiguate these things. Obviously, people are working on that, but I think we are quite far from true natural language understanding, but yeah, I think that’s a really, really interesting question.
There were a lot of them, “Who invented the light bulb?” and “How many countries are there in the world?” I mean the list was endless. I didn’t have to look around to find them. It was almost everything I asked, well, not literally, “What’s 2+2?” is obviously different, but there were plenty of examples.
To broaden that question, don’t you think if we were to build an AGI, an artificial general intelligence, an AI as versatile as a human, that’s table stakes, like you have to be able to do that much, right?
Oh, of course. I mean, I think that one of the defining things that makes human intelligence unique, is the ability to understand language and an understanding of grammar and all of this. It’s one of the most fundamental things that makes us human and intelligent. So I think, yeah, to have an artificial general intelligence, it would be completely vital and necessary to be able to do this sort of disambiguation.
Well, let me ratchet it up even another one. There’s a famous thought experiment called the Chinese Room problem. For the benefit of the listener, the setup is that there’s a person in a room who doesn’t speak any Chinese, and the room he’s in is full of this huge number of very specialized books; and people slide messages under the door to him that are written in Chinese. And he has this method where he looks up the first character and finds the book with that on the spine, and goes to the second character and the third and works his way through, until he gets to a book that says, “Write this down.” And he copies these symbols, again, he doesn’t know what the symbols are; he slides the message back out, and the person getting it thinks it’s a perfect Chinese answer, it’s brilliant, it rhymes, it’s great.
So, the thought experiment is this, does the man understand Chinese? And the point of the thought experiment is that this is all a computer does—it runs this deterministic program, and it never understands what it’s talking about. It doesn’t know if it’s about cholera or coffee beans or what have you. So, my question is, for an AGI to exist, does it need to understand the question in a way that’s different than how we’ve been using that word up until now?
That’s a good question. I think that, yeah, to have an artificial general intelligence, I think the computer would have to, in a way, understand the question. Now, that being said, what is the nature of understanding the question? How do we even think, is a question that I don’t think even we know the answer to. So, it’s a little bit difficult to say, exactly, what’s the minimum requirement that you would need for some sort of artificial general intelligence, because as it stands now, I don’t know. Maybe someone smarter than me knows the answer, but I don’t even know if I really understand how I understand things, if that makes sense to you.
So what do you do with that? Do you say, “Well, that’s just par for the course. There’s a lot of things in this universe we don’t understand, but we’re going to figure it out, and then we’ll build an AGI”? Is the question of understanding just a very straightforward scientific question, or is it a metaphysical question that we don’t really even know how to pose or answer?
I mean, I think that this question is a good question, and if we’re going about it the right way, it’s something that remains to be seen. But I think one way that we can try to ensure that we’re not straying off the path, is by going back to these biologically-inspired systems. Because we know that, at the end of the day, our brains are made up of neurons, synapses, connections, and there’s nothing very unique about this, it’s physical matter, there’s no theoretical reason why a computer cannot do the same computations.
So, if we can really understand how our brains are working, what the computations it performs are, how we have consciousness; then I think we can start to get at those questions. Now, that being said, in terms of where neuroscience is today, we really have a very limited idea of how our brains actually work. But I think it’s through this avenue that we stand the highest chance of success of trying to emulate, you know—
Let’s talk about that for a minute, I think that’s a fascinating topic. So, the brain has a hundred billion neurons that somehow come together and do what they do. There’s something called a nematode worm—arguably the most successful animal on the planet, ten percent of all animals on the planet are these little worms—they have I think 302 neurons in their brain. And there’s been an effort underway for twenty years to model that brain—302 neurons—in the computer and make a digitally living nematode worm, and even the people who have worked on that project for twenty years, don’t even know if that’s possible.
What I was hearing you say is, once we figure out what a neuron does—this reductionist view of the brain—we can build artificial neurons, and build a general intelligence, but what if every neuron in your brain has the complexity of a supercomputer? What if they are incredibly complicated things that have things going on at the quantum scale, that we are just so far away from understanding? Is that a tenable hypothesis? And doesn’t that suggest, maybe we should think about intelligence a different way because if a neuron’s as complicated as a supercomputer, we’re never going to get there.
That’s true, I am familiar with that research. So, I think that there’s a couple of ways that you can do this type of study because, for example, trying to model a neuron at the scale of its ion channels and individual connections is one thing, but there are many, many scales upon which your brain or any sort of neural system works.
I think to really get this understanding of how the brain works, it’s great to look at this very microscale, but it also helps to go very macro and instead of modeling every single component, try to, for example, take groups of neurons, and say, “How are they communicating together? How are they communicating with different parts of the brain?” Doing this, for example, is usually how human neuroscience works and humans are the ones with the intelligence. If you can really figure out on a larger scale, to the point where you can simplify some of these computations, and instead of understanding every single spike, perhaps understanding the general behavior or the general computation that’s happening inside the brain, then maybe it will serve to simplify this a little bit.
Where do you come down on all of that? Are we five years, fifty years or five hundred years away from cracking that nut, and really understanding how we understand and understanding how we would build a machine that would understand, all of this nuance? Do you think you’re going to live to see us make that machine?
I would be thrilled if I lived to see that machine, I’m not sure that I will. Exactly saying when this will happen is a bit hard for me to predict, but I know that we would need massive improvements; probably, algorithmically, probably in our hardware as well, because true intelligence is massively computational, and I think it’s going to take a lot of research to get there, but it’s hard to say exactly when that would happen.
Do you keep up with the Human Brain Project, the European initiative to do what you were talking about before, which is to be inspired by human brains and learn everything we can from that and build some kind of a computational equivalent?
A little bit, a little bit.
Do you have any thoughts on—if you were the betting sort—whether that will be successful or not?
I’m not sure if that’s really going to work out that well. Like you said before, given our current hardware, algorithms, our abilities to probe the human brain; I think it’s very difficult to make these very sweeping claims about, “Yes, we will have X amount of understanding about how these systems work,” so I’m not sure if it’s going to be successful in all the ways it’s supposed to be. But I think it’s a really valuable thing to do, whether or not you really achieve the stated goal, if that makes sense.
You mentioned consciousness earlier. So, consciousness, for the listeners, is something people often say we don’t know what it is; we know exactly what it is, we just don’t know how it is that it happens. What it is, is that we experience things, we feel things, we experience qualia—we know what pineapple tastes like.
Do you have any theories on consciousness? Where do you think it comes from, and, I’m really interested in, do we need consciousness in order to solve some of these AI problems that we all are so eager to solve? Do we need something that can experience, as opposed to just sense?
Interesting question. I think that there’s a lot of open research on how consciousness works, what it really means, how it helps us do this type of cognition. So, we know what it is, but how it works or how this would manifest itself in an artificial intelligence system, is really sort of beyond our grasp right now.
I don’t know how much true consciousness a machine needs, because, you could say, for example, that having a type of memory may be part of your consciousness, you know, being aware, learning things, but I don’t think we have yet enough really understanding of how this works to really say for sure.
All right fair enough. One more question and I’ll pull the clock back thirty years and we’ll talk about the here and now; but my last question is, do you think that a computer could ever feel something? Could a computer ever feel pain? You could build a sensor that tells the computer it’s on fire, but could a computer ever feel something, could we build such a machine?
I think that it’s possible. So, like I said before, there’s really no reason why—what our brain does is really a very advanced biological computer—you shouldn’t be able to feel pain. It is a sensation, but it’s really just a transfer of information, so I think that it is possible. Now, that being said, how this would manifest, or what a computer’s reaction would be to pain or what would happen, I’m not sure what that would be, but I think it’s definitely possible.
Fair enough. I mentioned in your introduction that you’re the CTO of an AI company Droice Labs, and the only setup I made was that it was a healthcare company. Tell us a little bit more, what challenge that Droice Labs is trying to solve, and what the hope is, and what your present challenges are and kind of the state of where you’re at?
Sure. Droice is a healthcare company that uses artificial intelligence to help provide artificial intelligence solutions to hospitals and healthcare providers. So, one of the main things that we’re focusing on right now is to try to help doctors choose the right treatment for their patients. This means things like, for example, you come in, maybe you’re sick, you have a cough, you have pneumonia, let’s say, and you need an antibiotic. What we try to do is, when you’re given an antibiotic, we try to predict whether or not this treatment will be effective for you, and also whether or not it’ll have any sort of adverse event on you, so both try to get people healthy, and keep them safe.
And so, this is really what we’re focusing on at the moment, trying to make a sort of artificial brain for healthcare that can, shall we say, augment the intelligence of the doctors and try to make sure that people stay healthy. I think that healthcare’s a really interesting sphere in which to use artificial intelligence because currently the technology is not very widespread because of the difficulty in working with hospital and medical data, so I think it’s a really interesting opportunity.
So, let’s talk about that for a minute, AIs are generally only as good as the data we train them with. Because I know that whenever I have some symptom, I type it into the search engine of choice, and it tells me I have a terminal illness; it just happens all the time. And in reality, of course, whatever that terminal illness is, there is a one-in-five-thousand chance that I have that, and then there’s also a ninety-nine percent chance I have whatever much more common, benign thing. How are you thinking about how you can get enough data so that you can build these statistical models and so forth?
We’re a B2B company, so we have partnerships with around ten hospitals right now, and what we do is get big data dumps from them of actual electronic health records. And so, what we try to do is actually use real patient records, like, millions of patient records that we obtain directly from our hospitals, and that’s how we really are able to get enough data to make these types of predictions.
How accurate does that data need to be? Because it doesn’t have to be perfect, obviously. How accurate does it need to be to be good enough to provide meaningful assistance to the doctor?
That is actually one of the big challenges, especially in this type of space. In healthcare, it’s a bit hard to say which data is good enough, because it’s very, very common. I mean, one of the hallmarks of clinical or medical data is that it will, by default, contain many, many missing values, you never have the full story on any given patient.
Additionally, it’s very common to have things like errors, there’s unstructured text in your medical record that very often contains mistakes or just insane sentence fragments that don’t really make sense to anyone but a doctor, and this is one of the things that we work really hard on, where a lot of times traditional AI methods may fail, but we basically spend a lot of time trying to work with this data in different ways, come up with noise-robust pipelines that can really make this work.
I would love to hear more detail about that, because I’m sure it’s full of things like, “Patient says their eyes water whenever they eat potato chips,” and you know, that’s like a data point, and it’s like, what do you do with that. If that is a big problem, can you tell us what some of the ways around it might be?
Sure. I’m sure you’ve seen a lot of crazy stuff in these health records, but what we try to do is—instead of biasing our models by doing anything in a rule-based manner—we use the fact that we have big data, we have a lot of data points, to try to really come up with robust models, so that, essentially, we don’t really have to worry about all that crazy stuff in there about potato chips and eyes watering.
And so, what we actually end up doing is, basically, we take these many, many millions of individual electronic health records, and try to combine that with outside sources of information, and this is one of the ways that we can try to really augment the data on our health record to make sure that we’re getting the correct insights about it.
So, with your example, you said, “My eyes water when I eat potato chips.” What we end up doing is taking that sort of thing, and in an automatic way, searching sources of public information, for example clinical trials information or published medical literature, and we try to find, for example, clinical trials or papers about the side effects of rubbing your eyes while eating potato chips. Now of course, that’s a ridiculous example, but you know what I mean.
And so, by augmenting this public and private data together, we really try to create this setup where we can get the maximum amount of information out of this messy, difficult to work with data.
The kinds of data you have that are solid data points, would be: how old is the patient, what’s their gender, do they have a fever, do they have aches and pains; that’s very coarse-level stuff. But like—I’m regretting using the potato chip example because now I’m kind of stuck with it—but, a potato chip is made of a potato which is a tuber, which is a nightshade and there may be some breakthrough, like, “That may be the answer, it’s an allergic reaction to nightshades. And that answer is so many levels removed.
I guess what I’m saying is, and you said earlier, language is infinite, but health is near that, too, right? There are so many potential things something could be, and yet, so few data points, that we must try to draw from. It would be like, if I said, “I know a person who is 6’ 4” and twenty-seven years old and born in Chicago, what’s their middle name?” It’s like, how do you even narrow it down to a set of middle names?
Right, right. Okay, I think I understand what you’re saying. This is, obviously, a challenge, but one of the ways that we kind of do this is, the first thing is our artificial intelligence is really intended for doctors and not the patients. Although, we were just talking about AGI and when it will happen, but the reality is we’re not there yet, so while our system tries to make these predictions, it’s under the supervision of a doctor. So, they’re really looking at these predictions and trying to pull out relevant things.
Now, you mentioned, the structured data—this is your age, your weight, maybe your sex, your medications; this is structured—but maybe the important thing is in the text, or is in the unstructured data. So, in this case, one of the things that we try to do, and it’s one of the main focuses of what we do, is to try to use natural language processing, NLP, to really make sure that we’re processing this unstructured data, or this text, in a way to really come up with a very robust, numerical representation of the important things.
So, of course, you can mine this information, this text, to try to understand, for example, you have a patient who has some sort of allergy, and it’s only written in this text, right? In that case, you need a system to really go through this text with a fine-tooth comb, and try to really pull out risk factors for this patient, relevant things about their health and their medical history that may be important.
So, is it not the case that diagnosing—if you just said, here is a person who manifests certain symptoms, and I want to diagnose what they have—may be the hardest problem possible. Especially compared to where we’ve seen success, which is, like, here is a chest x-ray, we have a very binary question to ask: does this person have a tumor or do they not? Where the data is: here’s ten thousand scans with the tumor, here’s a hundred thousand without a tumor.
Like, is it the cold or the flu? That would be an AI kind of thing because an expert system could do that. I’m kind of curious, tell me what you think—and then I’d love to ask, what would an ideal world look like, what would we do to collect data in an ideal world—but just with the here and now, aspirationally, what do you think is as much as we can hope for? Is it something, like, the model produces sixty-four things that this patient may have, rank ordered, like a search engine would do from the most likely to the least likely, and the doctor can kind of skim down it and look for something that catches his or her eye. Is that as far as we can go right now? Or, what do you think, in terms of general diagnosing of ailments?
Sure, well, actually, what we focus on currently is really on the treatment, not on the diagnosis. I think the diagnosis is a more difficult problem, and, of course, we really want to get into that in the future, but that is actually somewhat more of a very challenging sort of thing to do.
That being said, what you mentioned, you know, saying, “Here’s a list of things, let’s make some predictions of it,” is actually a thing that we currently do in terms of treatments for patients. So, one example of a thing that we’ve done is built a system that can predict surgical complications for patients. So, imagine, you have a patient that is sixty years old and is mildly septic, and may need some sort of procedure. What we can do is find that there may be a couple alternative procedures that can be given, or a nonsurgical intervention that can help them manage their condition. So, what we can do is predict what will happen with each of these different treatments, what is the likelihood it will be successful, as well as weighing this against their risk options.
And in this way, we can really help the doctor choose what sort of treatment that they should give this person, and it gives them some sort of actionable insight, that can help them get their patients healthy. Of course, in the future, I think it would be amazing to have some sort of end to end system that, you know, a patient comes in, and you can just get all the information and it can diagnose them, treat them, get them better, but we’re definitely nowhere near that yet.
Recently, IBM made news that Watson had prescribed treatment for cancer patients that was largely identical to what the doctors did, but it had the added benefit that in a third of the cases it found additional treatment options, because it had virtue of being trained on a quarter million medical journals. Is that the kind of thing that’s like “real, here, today,” that we will expect to see more things like that?
I see. Yeah, that’s definitely a very exciting thing, and I think that’s great to see. One of the things that’s very interesting, is that IBM primarily works on cancer. It’s lacking in these high prescription volume sorts of conditions, like heart disease or diabetes. So, I think that while this is very exciting, this is definitely a sort of technology, and a space for artificial intelligence, where it really needs to be expanded, and there’s a lot of room to grow.
So, we can sequence a genome for $1,000. How far away are we from having enough of that data that we get really good insights into, for example, a person has this combination of genetic markers, and therefore this is more likely to work or not work. I know that in isolated cases we can do that, but when will we see that become just kind of how we do things on a day-to-day basis?
I would say, probably, twenty-five years from the clinic. I mean, it’s great, this information is really interesting, and we can do it, but it’s not widely used. I think there are too many regulations in place right now that keep this from happening, so, I think it’s going to be, like I said, maybe twenty-five years before we really see this very widely used for a good number of patients.
So are there initiatives underway that you think merit support that will allow this information to be collected and used in ways that promote the greater good, and simultaneously, protect the privacy of the patients? How can we start collecting better data?
Yeah, there are a lot of people that are working on this type of thing. For example, Obama had a precision medicine initiative and these types of things where you’re really trying to, basically, get your health records and your genomic data, and everything consolidated and have a very easy flow of information so that doctors can easily integrate information from many sources, and have very complete patient profiles. So, this is a thing that’s currently underway.
To pull out a little bit and look at the larger world, you’re obviously deeply involved in speech, and language processing, and health care, and all of these areas where we’ve seen lots of advances happening on a regular basis, and it’s very exciting. But then there’s a lot of concern from people who have two big worries. One is the effect that all of this technology is going to have on employment. And there’s two views.
One is that technology increases productivity, which increases wages, and that’s what’s happened for two hundred years, or, this technology is somehow different, it replaces people and anything a person can do eventually the technology will do better. Which of those camps, or a third camp, do you fall into? What is your prognosis for the future of work?
Right. I think that technology is a good thing. I know a lot of people have concerns, for example, that if there’s too much artificial intelligence it will replace my job, there won’t be room for me and for what I do, but I think that what’s actually going to happen, is we’re just going to see, shall we say, a shifting employment landscape.
Maybe if we have some sort of general intelligence, then people can start worrying, but, right now, what we’re really doing through artificial intelligence is augmenting human intelligence. So, although some jobs become obsolete, now to maintain these systems, build these systems, I believe that you actually have, now, more opportunities there.
For example, ten to fifteen years ago, there wasn’t such a demand for people with software engineering skills, and now it’s almost becoming something that you’re expected to know, or, like, the internet thirty years back. So, I really think that this is going to be a good thing for society. It may be hard for people who don’t have any sort of computer skills, but I think going forward, that these are going to be much more important.
Do you consume science fiction? Do you watch movies, or read books, or television, and if so, are there science fiction universes that you look at and think, “That’s kind of how I see the future unfolding”?
Have you ever seen the TV show Black Mirror?
Well, yeah that’s dystopian though, you were just saying things are going to be good. I thought you were just saying jobs are good, we’re all good, technology is good. Black Mirror is like dark, black, mirrorish.
Yeah, no, I’m not saying that’s what’s going to happen, but I think that’s presenting the evil side of what can happen. I don’t think that’s necessarily realistic, but I think that show actually does a very good job of portraying the way that technology could really be integrated into our lives. Without all of the dystopian, depressing stories, I think that the way that it shows the technology being integrated into people’s lives, how it affects the way people live—I think it does a very good job of doing things like that.
I wonder though, science fiction movies and TV are notoriously dystopian, because there’s more drama in that than utopian. So, it’s not conspiratorial or anything, I’m not asserting that, but I do think that what it does, perhaps, is causes people—somebody termed it “generalizing from fictional evidence,” that you see enough views of the future like that, you think, “Oh, that’s how it’s going to happen.” And then that therefore becomes self-fulfilling.
Frank Herbert, I think, it was who said, “Sometimes the purpose of science fiction is to keep a world from happening.” So do you think those kinds of views of the world are good, or do you think that they increase this collective worry about technology and losing our humanity, becoming a world that’s blackish and mirrorish, you know?
Right. No, I understand your point and actually, I agree. I think there is a lot of fear, which is quite unwarranted. There is actually a lot more transparency in AI now, so I think that a lot of those fears are just, well, given the media today, as I’m sure we’re all aware, it’s a lot of fear mongering. I think that these fears are really something that—not to say there will be no negative impact—but, I think, every cloud has its silver lining. I think that this is not something that anyone really needs to be worrying about. One thing that I think is really important is to have more education for a general audience, because I think part of the fear comes from not really understanding what AI is, what it does, how it works.
Right, and so, I was just kind of thinking through what you were saying, there’s an initiative in Europe that, AI engines—kind of like the one you’re talking about that’s suggesting things—need to be transparent, in the sense they need to be able to explain why they’re making that suggestion.
But, I read one of your papers on deep neural nets, and it talks about how the results are hard to understand, if not impossible to understand. Which side of that do you come down on? Should we limit the technology to things that can be explained in bulleted points, or do we say, “No, the data is the data and we’re never going to understand it once it starts combining in these ways, and we just need to be okay with that”?
Right, so, one of the most overused phrases in all of AI is that “neural networks are a black box.” I’m sure we’re all sick of hearing that sentence, but it’s kind of true. I think that’s why I was interested in researching this topic. I think, as you were saying before, the why in AI is very, very important.
So, I think, of course we can benefit from AI without knowing. We can continue to use it like a black box, it’ll still be useful, it’ll still be important. But I think it will be far more impactful if you are able to explain why, and to really demystify what’s happening.
One good example from my own company is that in medicine it’s vital for the doctor to know why you’re saying what you’re saying, at Droice. So, if a patient comes in and you say, “I think this person is going to have a very negative reaction to this medicine,” it’s very vital for us to try to analyze the neural network and explain, “Okay, it’s really this feature of this person’s health record, for example, the fact that they’re quite old and on another medication.” That really makes them trust the system, and really eases the adoption, and allows them to integrate into traditionally less technologically focused fields.
So, I think that there’s a lot of research now that’s going into the why in AI, and it’s one of my focuses of research, and I know the field has really been blooming in the last couple of years, because I think people are realizing that this is extremely important and will help us not only make artificial intelligence more translational, but also help us to make better models.
You know, in The Empire Strikes Back, when Luke is training on Dagobah with Yoda, he asked him, “Why, why…” and Yoda was like, “There is no why.” Do you think there are situations where there is no why? There is no explainable reason why it chose what it did?
Well, I think there is always a reason. For example, you like ice cream; well, maybe it’s a silly reason, but the reason is that it tastes good. It might not be, you know, you like pistachio better than caramel flavor—so, let’s just say the reason may not be logical, but there is a reason, right? It’s because it activates the pleasure center in your brain when you eat it. So, I think that if you’re looking for interpretability, in some cases it could be limited but I think there’s always something that you could answer when asking why.
Alright. Well, this has been fascinating. If people want to follow you, keep up with what you’re doing, keep up with Droice, can you just run through the litany of ways to do that?
Yeah, so we have a Twitter account, it’s “DroiceLabs,” and that’s mostly where we post. And we also have a website: www.droicelabs.com, and that’s where we post most of the updates that we have.
Alright. Well, it has been a wonderful and far ranging hour, and I just want to thank you so much for being on the show.
Thank you so much for having me.