On this episode Byron has a conversation with Hilary Mason, an acclaimed data and research scientist, about the mechanics and philosophy behind designing and building AI.
Hilary Mason is an American data scientist and the founder of technology startup Fast Forward Labs as well as Data Scientist in Residence at Accel Partners. She was the Chief Scientist at bitly. On September 7, 2017, Cloudera announced that it had acquired Fast Forward Labs, and that Mason would become Cloudera's Vice President of Research.
Byron Reese: This is Voices in AI, brought to you by Gigaom and I am Byron Reese. Today, our guest is Hilary Mason. She is the GM of Machine Learning at Cloudera, and the founder and CEO of Fast Forward Labs, and the Data Scientist in residence at Accel Partners, and a member of the Board of Directors at the Anita Borg Institute for Women in Technology, and the co-founder of hackNY.org. That’s as far down as it would let me read in her LinkedIn profile, but I’ve a feeling if I’d clicked that ‘More’ button, there would be a lot more.
Welcome to the show, amazing Hilary Mason!
Hilary Mason: Thank you very much. Thank you for having me.
I always like to start with the question I ask everybody because I’ve never had the same answer twice and – I’m going to change it up: why is it so hard to define what intelligence is? And are we going to build computers that actually are intelligent, or they can only emulate intelligence, or are those two things the exact same thing?
This is a fun way to get started! I think it’s difficult to define intelligence because it’s not always clear what we want out of the definition. Are we looking for something that distinguishes human intelligence from other forms of intelligence? There’s that joke that’s kind of a little bit too true that goes around in the community that AI, or artificial intelligence, is whatever computers can’t do today. Where we keep moving the bar, just so that we can feel like there’s something that is still uniquely within the bounds of human thought.
Let’s move to the second part of your discussion which is really asking, ‘Can computers ever be indistinguishable from human thought?’ I think it’s really useful to put a timeframe on that thought experiment and to say that in the short term, ‘no.’ I do love science fiction, though, and I do believe that it is worth dreaming about and working towards a world in which we could create intelligences that are indistinguishable from human intelligences. Though I actually, personally, think that it is more likely we will build computational systems to augment and extend human intelligence. For example, I don’t know about you but my memory is horrible. I’m routinely absentminded. I do use technology to augment my capabilities there, and I would love to have it more integrated into my own self and my intelligence.
Yeah, did you know ancient people, not even that far back, like Roman times, had vastly better memories than we had? We know of one Roman general that knew the names of all 25,000 of his troops and the names of all their families. Yet, Plato wasn’t a big fan of writing for that very reason. He said that with writing, you’ve invented a system for reminding yourself but not for remembering anything. He predicted that once literacy was widespread, our memories would go to pot, and he was right. Like you, I can’t remember my PIN# half the time!
I guess my real question, though, is when you ask people – “well, when will we have a general intelligence?” you get a range of answers, from five years (Elon Musk used that timeline) up to 500. Andrew Ng compares worrying about such things to worrying about overpopulation on Mars. The reason the range is so wide is nobody knows how to build a general intelligence. Would you agree with that?
Yes, I would agree, and I would firmly state that I do not believe there is a technical path from where we are today to that form of general intelligence.
You know that’s a fantastic observation because machine learning, our trick du jour, is an idea that says: ‘let’s take information about the past, study it, look for patterns, and project them into the future.’ That may not be a path to general intelligence. Is that what you’re saying?
That is what I’m saying. That we know how to build systems that look at data and make predictions or forecasts that infer things that we can’t even directly observe, which is remarkable. We do not know how to make systems that mimic intelligence in ways that would distinguish it from the systems or from humans.
I’ve had 100 guests on this show – and they virtually all believe we could/can, with your caveat about the timeframe, create a general intelligence, even though they all agree we don’t know how to do it. The reason those two things are compatible is they have a simple assumption that is: humans are machines, specifically our brains are machines. You know how the thought experiment goes... if you could take what a neuron did and model that and then did that a hundred billion times and figured out what the glial cells do and all that other stuff, there’s no reason you can’t build a general intelligence.
Do you believe people are machines, or our brains are purely mechanistic in the sense that there’s nothing about them that cannot be described with physics, really?
So I do believe that, with the caveat that we don’t necessarily understand all of that physics today. I do think there is a biological and physical basis for human intelligence, and that should we understand it well enough, we could possibly construct something that’s indistinguishable. But we certainly don’t understand it and we may need to invent entire new fields of physics before we would.
Of the 100 guests on the show, 95 of them, plus or minus 1, had that viewpoint: that there’s a mechanistic base for our intelligence, therefore logically we will build it. When I put that same question on my website and asked the general public, only 15% of people agreed with the statement.
I’ll ask two questions. Do you think that disconnect is material? And secondly, is there an argument that we cannot build a general intelligence that does not rely on anything spiritual?
How sure are you that the people clicking that button on your website are human?
(Laughter) I ask this question [when] I give speeches to a room full of people. When I wrote my last book about artificial intelligence, and I wrote that the basis for believing a general intelligence is the mechanism of the human brain, my editor wrote in the column, “Come on! Does anybody really believe that?” That idea is preposterous to the vast majority of the world. I think what they don’t understand is that that idea is foundational to the people working on artificial intelligence. That’s the basis for believing we can make general intelligence.
Yes, and I think it is something that you’ll see repeated throughout technical development. That once it’s known that something is achievable, that it can exist, many more people will figure out how to make it exist.
Well that’s a big thought right there. I like that. Okay, well you’ve indulged me – I’ll ask my parting question. We have these brains, and we don’t understand how they work. We don’t even understand how a nematode brain works. They studied that for 20 years in the OpenWorm project. The nematode worm packs 302 neurons into its “brain,” and we can’t even model that.
Then we don’t know how the mind works and the mind is all of this. Your liver doesn’t have a sense of humor but somehow you do. How is that? No cell has a sense of humor. You do. You have these emergent – and we don’t understand how the mind comes about. Most interestingly, we experience the world; we’re conscious. We can feel warmth, whereas a computer can only measure temperature. I always have found it a stretch to say, ‘yeah, we don’t know how the brain works, we don’t know how the mind emerges, and we don’t know even how to ask the question of consciousness scientifically, but we know we can build it, someday.’ Does that not strike you as a disconnect?
Not at all! It strikes me as the core of human ambition that leads us to inquiry. We’re not saying we know how to build it tomorrow, but rather that because we see it exist in the world, we think we can build it too. I think that’s actually quite hopeful.
I try to keep track of everybody working [on it] because 99% of the money coming into AI [goes to] stuff like what you do. When I try to count the people that are working on general intelligence, I get about 10: OpenAI, Carnegie Mellon, Oren Etzioni... You can start naming them. Do you believe that, that very few people are actually working on general intelligence?
Well, I think that is more of a branding question than it is a fundamental technical question in the sense that, again, there is no path from where we stand today, to general intelligence. People take varying approaches to improve systems. Some do it with that goal of trying to emulate human cognition and others do it with a goal of just building something that happens to work really well and be super useful. I actually don’t think, if you look at the math, that what is happening in some of these groups is all that different, though of course, I’m not in those groups so I don’t know. I think it’s more about the stories people tell about the goals of their work and why they’re doing what they’re doing.
Fair enough. I just wondered if your perspective was different. You’re one of the minority, not small minority, but minority of people who don’t actually think – anybody who’s doing faster machine learning probably isn’t actually working on that problem.
To switch gears, I promised that would be the last one of those questions. Let’s go to the world of ‘here and now.’ Tell me what you – well, you do all these things. Let’s start with Cloudera because you’re a practical person. Tell me what Cloudera is and what you do there and the kinds of challenges that you face over your morning coffee.
Okay, and I am a big fan of morning coffee! I’m the General Manager of Cloudera’s machine-learning business, which is a broad umbrella over everything probabilistic that our customers want to do, with structured or unstructured data. If you’re talking about AI, if you’re talking about machine learning, if you’re talking about data science, that’s all under our purview. We have a bunch of capabilities there.
Our customers tend to be large enterprises, many of the largest companies in the world. They’re often in highly regulated industries or they work in spaces like insurance, telecom, finance, manufacturing. They have problems, like supply chain type problems. They have customer happiness problems. Here, I should be clear, I’m using “problem” in the computer science sense, and I’ve been told that in public I should call them ‘opportunities.’ I mean exciting things and capabilities that we can explore. We do a few different things.
We have a software platform for people to run data science workloads, all the way from data engineering through data science experimentation, out to production. We have a set of consulting services that help our customers figure out what a great data strategy should look like and how to execute against it effectively. We also help build applications from time to time. We have an applied machine-learning research team that looks 6 months to 24 months ahead in terms of technical capabilities that our customers will want to use to progress their missions, and makes those capabilities easier to understand by exploring them in the context of real-world business problems.
The kinds of things I think about with my morning coffee are: what does the world look like when our customers, who today tend to have somewhere around three or seven (those are the numbers I usually get) machine-learning models in production at scale, want to have 1,000? How do they effectively manage to give data scientists access to all the tools that they want, which are evolving very quickly, while at the same time ensuring the security of their data?
These are the kinds of questions we think about and the kinds of business problems. One of the great privileges of this position is getting to see people taking data off of airplanes, all the way to manufacturing environments to banks, looking at things like anti-money laundering algorithms. It’s fascinating.
Can you talk specifics? I understand a lot of times you can’t, but are there any stories you can tell us of one of those fascinating things that came to you and you came up with a solution and it was like, wow?
Yeah, we have a bunch of stories. Some of them are super exciting, very forward-looking. Things for example [like] working with a surgical robotics company to help imagine the future of their surgical robots, by solving a research question, which was: Given video coming off endoscopes during surgery, is it possible to use machine learning, in this case deep learning, to figure out when a surgery is going awry? Therefore, eventually to build a product that can help a surgeon actually have better outcomes? This is the kind of question we love because it’s something where we don’t know if it’s possible at the beginning, but we can put a really nice process around it and one by one, answer questions that get us to a useful outcome.
I’ll give another example, if that’s okay, which is on the complete other side of the space where we work. We had another customer where we worked hand in hand with them - they’re a bank - to really reduce the cost of operating their call center and help their customers get better outcomes without having to call in at all. That one I find fascinating because the work that happened was not mathematically interesting. It was fairly straightforward data science; no new math was created to do this.
The challenge in this project was really integrating a bunch of systems together and then effecting change. Taking a year’s-worth of calls into a call center and doing speech to text. Taking all of that text and clustering it. Getting experts to look at those clusters and label them and figure out what people were actually calling about and what the best interventions were. Connecting that into a call center software system so that when people called in, it would give them access to those recommended interventions and then collect data from those folks to refine the quality over time. Then eventually, proactively reaching out to customers, to help them when they had a problem before they even got around to even calling in.
This is a very different kind of data science problem where, again, the math is very straightforward. It’s been well understood for decades, but the outcome is incredibly impactful. Here, I’ve given you two. One that was a true R&D question and one that was really about organizational effectiveness and identifying the outcomes that people care about.
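[Editor’s sketch: the clustering and labeling steps described above can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not the bank’s actual system: the transcripts are invented, the speech-to-text step is assumed to have already happened, and the function names (`tokenize`, `cluster_calls`, `top_terms`) and the greedy "leader" clustering on Jaccard word overlap are stand-ins chosen for brevity. A production pipeline would more likely use a library clustering algorithm.]

```python
from collections import Counter

# Small hand-picked stopword list (an assumption for this toy example).
STOPWORDS = {"i", "my", "the", "a", "to", "is", "and", "not", "on",
             "for", "in", "it", "at", "was"}


def tokenize(text):
    """Lowercase a transcript and return its informative words as a set."""
    words = (w.strip(".,!?'\"").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def jaccard(a, b):
    """Word-overlap similarity between two token sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_calls(transcripts, threshold=0.2):
    """Greedy 'leader' clustering.

    Each call joins the first cluster whose first member (the leader)
    is similar enough; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of transcript indices.
    """
    token_sets = [tokenize(t) for t in transcripts]
    clusters = []
    for i, tokens in enumerate(token_sets):
        for cluster in clusters:
            if jaccard(tokens, token_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def top_terms(transcripts, cluster, n=3):
    """Most frequent words in a cluster, shown to experts for labeling."""
    counts = Counter(w for i in cluster for w in tokenize(transcripts[i]))
    return [w for w, _ in counts.most_common(n)]


# Invented transcripts standing in for speech-to-text output.
calls = [
    "I forgot my password and cannot log in",
    "My card was declined at the store",
    "Password reset link is not working, cannot log in",
    "Card declined again at checkout",
]

clusters = cluster_calls(calls)
for cluster in clusters:
    print(cluster, top_terms(calls, cluster))
```

[An expert would then look at each cluster’s top terms, give it a label such as "login trouble," and attach a recommended intervention; those labels are what the call center software would surface.]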
When I hear you tell these stories, I think to myself, our descendants are going to look back at us and think we waltzed through our lives like drunken sailors on shore leave, just making decisions based on anecdotal evidence. Then there’ll be this clear demarcation at the point at which we mustered the power of data, to build it like a collective memory for the whole planet.
I can’t look at anything in my life anymore and not realize I’m probably doing this really poorly and yet, there is actually a solution to it. I just have to think, in that gap is an enormous amount of human potential and satisfaction. Do you ever get struck by that? That everywhere we look we’re surrounded by places that data could shine light [on]?
Absolutely. If you just think about things like simple home repairs, this is something that some people know how to do. You could point your phone at it and have an augmented reality demonstration of ‘move this bolt this way that nut that way.’ Maybe you can actually do some plumbing without knowing really what you’re doing. I think there are many opportunities to augment our own skills in many different areas with that data and information, but with the caveat that that takes away some of the fun.
Fair enough, fair enough. Also, I like to think about the other side of the equation, which is: you can show a child a line drawing of a cat and they’ll say, “oh there’s one.” Then they’ll see a Manx cat without a tail and they’ll say, “There’s a cat without a tail,” even when you never told them there was such a thing as a cat without a tail. Yet, there’s some essence of ‘cat-ness’ that those two ears and those eyes and those whiskers, that that toddler learns effortlessly. Yet, our models need so much data to train and they make the simplest mistakes.
Are those the kinds of things you think, ‘how do humans train themselves on sample sizes of one and why do our models need so many?’ Do you have thoughts on that?
One-shot learning is a very active area of research for just this reason, I think, which is that in many real-world applications, we don’t have those large, clean, rich datasets to rely on. Yes, I absolutely find it fascinating that human beings have this ability to recognize patterns with such a small relative amount of data, and the machines we build don’t necessarily share that.
I’m an optimist about the future. Anybody who reads my writing [knows] I believe we will all use these technologies to increase the productivity of everybody and that it will usher in a new golden age of humanity and all of the rest. Yet, we’re all aware of the potential for abuse of this technology as well. What’s interesting is if you take a nefarious use, like ‘hey you could listen to every phone conversation,’ the same exact tools we use to build good things can be used in these bad applications and so forth.
How do you balance the good uses of this technology that are pro-human and then there are ones that tear down? How do you think that’s going to play out in the future, and are you an optimist?
I am an optimist but I also believe it is our responsibility as people creating this technology, to create it in a way that generally has a positive impact on people who will be using it. I think that doing this well is really a matter of who is empowered by the technology, who’s in control of it; who has a right to understand which decisions are being made by it; and who can appeal those decisions; and does a system like that even exist?
I can tell you that many systems today do not take those questions into account and are designed explicitly to be exploitative. As people building AI or data science technology, it’s our responsibility to make sure that we are not the ones building those things.
How do you – put some flesh on those bones. How do you do that? How do you make sure that everybody....
I think we have to change the modern practice of AI such that when you build a system that impacts people, there is an expected step or two in that process, where you are trying to understand and evaluate that impact. If you’re interested in specifics or if folks listening are, I co-authored a very short book on this with DJ Patil, who was the Chief Data Scientist for President Obama, and Mike Loukides from O’Reilly [Media], designed specifically for technical practitioners to think about steps that they can take to mitigate potential harm of the systems they build. Now that will only get us so far as to control things that you could potentially foresee. There are always going to be unintended consequences, and I think we need tools and means to appeal and manage those as well.
I noticed you seemed careful in your choice of words about these other steps – I’m a bit of a doubter about explainability, personally. If you went to Google and said, “When I do a search on Akron, Ohio, pool cleaning, I come up number five and my competitor is number four. Why are they four and I’m five?” I would think the folks at Google would say, “There’s no way to know out of 15 million pages why they’re four and you’re five.” The number of variables that go into it: it’s 800, it’s 1,000, it’s 1,200, and it changes dynamically with the rising of the sun.
Do you think AI systems are inherently explainable? Or, if we put that burden on them, are we not inhibiting their advancement? Would ‘if you can’t explain it, we can’t do it’ hold the field back?
I do think there are applications where explainability is a requirement and therefore, if you can’t get a reasonable explanation, you should not be using that particular capability in the system. I don’t think that’s true of most AI applications. I do think explainability and interpretability are important for general practice. It’s quite possible to build a system that’s doing something less than optimal, even if it happens to be scoring very well on your test datasets. This is just one tool to help you figure that out. It’s not just about explainability for avoiding bias, it’s explainability for building the best possible system.
On that other topic, though, there are other ways to do bias testing that are statistically significant and robust, and again, have been useful ‘in the wild’ in understanding situations where nobody intended to encode a bias, but it ended up being expressed in the end result anyway. Things like showing higher paying job opportunities to people with traditional male names over female names. I doubt anyone ever programmed that in explicitly, but the fact that it happened and the fact that we can understand that it happened, means that we now have an opportunity to correct it.
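[Editor’s sketch: one common way to check an outcome like the job-ad example statistically is a two-proportion z-test, which asks whether two groups were shown something at rates that differ by more than chance would explain. The interview does not say which tests were used, so this is only an illustration; the function name `two_proportion_z_test` and all of the counts below are made up.]

```python
import math


def two_proportion_z_test(hits_a, total_a, hits_b, total_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value). A small p_value means the gap between the
    two groups' rates is unlikely to be due to chance alone.
    """
    rate_a = hits_a / total_a
    rate_b = hits_b / total_b
    # Pooled rate under the null hypothesis that both groups are treated alike.
    pooled = (hits_a + hits_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical audit: how often a high-paying job ad was shown to
# profiles with traditionally male vs. female names (invented counts).
z, p = two_proportion_z_test(hits_a=300, total_a=1000,
                             hits_b=200, total_b=1000)
print(f"z = {z:.2f}, p = {p:.2g}")
```

[A significant result only says the rates differ; it does not say why. Finding the cause still takes the kind of investigation described above.]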
We’re running out of time here and you have all of these other things that you do. What’s another project or initiative or story you would like to tell before we wrap up here? I feel like I’m shortchanging all these other things that keep you so busy.
I am always getting into trouble and I have a lot of hobbies. Well, if I can add one more quick plug, I’d say check out the Anita Borg Institute for Women in Technology. They support the Grace Hopper Conference, which is, I believe, one of the biggest events for women in technology and have numerous other programs for supporting women at many levels of their career. It’s a wonderful organization.
Alright, Hilary Mason, it has been a fascinating half hour. We could go on indefinitely but you have to go do all the stuff that you do. How can people keep up with you personally and your writing and whatnot?
Thank you so much and come back and visit us again.
Thank you, Byron. This was a lot of fun.