In this episode Byron and Dr. David discuss evolutionary computation, deep learning and neural networks, as well as AI's role in improving cyber-security.
Dr. Eli David is the CTO and co-founder of Deep Instinct and has published multiple papers on deep learning and genetic algorithms in leading AI journals.
Byron Reese: This is Voices in AI, brought to you by GigaOm. I’m Byron Reese. And today, our guest is Dr. Eli David. He is the CTO and the co-founder of Deep Instinct. He’s an expert in the field of computational intelligence, specializing in deep learning and evolutionary computation. He’s published more than 30 papers in leading AI journals and conferences, mostly focusing on applications of deep learning and genetic algorithms in various real-world domains. Welcome to the show, Eli.
Eli David: Thank you very much. Great to be here.
So bring us up to date, or let everybody know what do we mean by evolutionary computation, and deep learning and neural networks? Because all three of those are things that, let’s just say, they aren’t necessarily crystal clear in everybody’s minds what they are. So let’s begin by defining your terms. Explain those three concepts to us.
Sure, definitely. Now, both neural networks and evolutionary computation take inspiration from intelligence in nature. If, instead of trying to come up with smart mathematical ways of creating intelligence, we just look at nature to see how intelligence works there, we can reach two very obvious conclusions. First, consider the algorithm in charge of creating intelligence: we started from single-cell organisms billions of years ago, and now we are intelligent organisms, and the main algorithm, or maybe the only algorithm, in charge of that was evolution. So evolutionary computation takes inspiration from the evolutionary process in nature and tries to evolve computer programs so that, from one generation to the next, they become smarter and smarter; the smarter they are, the more they breed, the more children they have, and so, hopefully, the smart genes improve one generation after the other.
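As an illustration of the evolutionary loop just described, here is a minimal sketch of a genetic algorithm. It is not taken from the interview: the bit-string genomes, the toy `fitness` function (counting 1-bits), and every parameter value are assumptions chosen only to keep the example small.

```python
import random


def fitness(genome):
    # Toy objective (assumption): the number of 1-bits stands in for "smartness".
    return sum(genome)


def evolve(pop_size=30, genome_len=20, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Fitter individuals are more likely to be chosen as parents
        # ("the smarter they are, the more they breed").
        weights = [fitness(g) + 1 for g in population]
        # Elitism: the current best individual survives unchanged.
        next_gen = [max(population, key=fitness)[:]]
        while len(next_gen) < pop_size:
            mom, dad = rng.choices(population, weights=weights, k=2)
            cut = rng.randrange(1, genome_len)  # single-point crossover
            child = mom[:cut] + dad[cut:]
            # Each bit flips with a small probability (mutation).
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

Because the best individual is always carried over, the best fitness can never decrease from one generation to the next; whether it reaches the maximum depends on the assumed parameters.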
The other thing that we will notice when we observe nature is brains. Nearly all the intelligence in humans or other mammals or other intelligent animals is due to a neural network, a network of neurons which we refer to as a brain: many small processing units connected to each other via what we call synapses. In our brains, for example, we have many tens of billions of such neurons, each one of them, on average, connected to about ten thousand other neurons, and these small processing units connected to each other, they create the brain; they create all our intelligence. So the two fields of evolutionary computation and artificial neural networks, nowadays referred to as deep learning, and we will shortly dwell on the difference as well, take direct inspiration from nature.
Now, what is the difference between deep learning, deep neural networks, traditional neural networks, etc? So, neural networks is not a new field. Already in the 1980s, we had most of the concepts that we have today. But the main difference is that during the past several years, we had several major breakthroughs, while until then, we could train only shallow neural networks, shallow artificial neural networks, just a few layers of neurons, just a few thousand synapses, connectors. A few years ago, we managed to make these neural networks deep, so instead of a few layers, we have many tens of layers; instead of a few thousand connectors, we have now hundreds of millions, or billions, of connectors. So instead of having shallow neural networks, nowadays we have deep neural networks, also known as deep learning. So deep learning and deep neural networks are synonyms.
So I’m going to want to get into your stock in trade of security and all of that in a moment, but I would love to just spend a little time here talking about intelligence as a concept. Do you think it’s telling that we don’t really have a consensus definition of what intelligence is? And, follow up, how do you even think of intelligence? You’re using the word a lot, but what exactly is it?
This is one of the most controversial definitions in the history of computer science to this day. The original examination of intelligence was developed by Alan Turing, one of the fathers of computer science, and it’s commonly known to this day as the Turing Test. Basically it’s if you cannot distinguish between a real human and an artificial brain, then that thing is intelligent.
Now, today, many people do not completely agree with this definition because this definition only looks at the exhibited result. For example, if we take the Deep Blue chess-playing program that in 1997 defeated Garry Kasparov, the world chess champion, on one hand, it exhibited intelligence. It defeated the world chess champion. On the other hand, if we look under the hood, it was just one big stupid calculator, just a big chess calculator. So, under the hood, it was not smart, but the result was intelligence.
Today, some researchers still believe that we must define intelligence as the exhibited result regardless of what is under the hood, and a growing number of researchers believe that to deem something as intelligent, it also must have intelligence under the hood, which we define as learning capability. So if you compare that Deep Blue chess-playing machine to a human chess-playing machine, that chess calculator could only play chess. You cannot teach it to do anything else, no matter how easy that thing is. But a learning brain is capable of learning additional things. So, today, more and more researchers believe that any definition of intelligence must involve learning as well, and learning is one of the cornerstones of intelligence.
So there’s two things wrapped up in that, because you think of a Nest thermostat, a “learning thermostat,” and it does learn in the sense that the more I use it, it’s going to cool and heat my home differently than it would heat your home; it “learns” my preferences, not yours. But that isn’t what you were talking about, because you were also, it sounds like, drawing a distinction between a narrow and a general AI because you were saying all it knows how to do is play chess; it can’t learn something else. So do you mean learning in the Nest example or learning in, like, the AGI example?
The Nest example, actually, it is certainly learning. Now, when we say learning in the case of intelligence, we usually think of human-level learning. But even the simplest kind of learning, like the example you mentioned, that is certainly learning. It is not human-level learning. It’s certainly not an impressive learning. But it is definitely learning. So, instead of defining it as a binary yes or no for the question “Is it intelligent?” we can see a very wide gray scale, from no learning whatsoever, which is your pocket calculator at one end of the scale, to the extreme other end of the scale, where you can imagine an advanced primate, such as humans, the best example. And we have this huge area in between where we can put various degrees of learning.
So our machines that we use on our desktop have a Von Neumann architecture. There’s memory, there’s fast memory and slow memory and there’s a CPU that exists different than that, and there’s input, there’s output. But the brain doesn’t act like that, right? It co-mingles memory and processing. Does that influence, in any way, how you go about achieving artificial intelligence in your mind? Is it the case that we only want to know how human intelligence works, but we’re not necessarily going to emulate how it comes about? What’re your thoughts on that?
The architecture is certainly different. If you look at the computer of Von Neumann architecture, you have the storage, you have the memory, you have the CPU. Now, if you look at our brains, we have just a single thing: neural networks. All our beliefs, thoughts, memories, feelings, everything, it is just the result of the electricity that passes between the neurons in our brain.
Now, for tens of years, the prevailing thought within the artificial intelligence community was that it’s a waste of time to try to look into our brains. We can do much better if we just find better mathematical methods for creating artificial intelligence. And until a few years ago, that was the prevailing dogma. Let me tell you a personal anecdote. Ten years ago, when I wanted to teach a dedicated course on neural networks at a university, back then the department chair agreed that I should teach that course, but with the only condition that the name of the course wouldn’t be Neural Networks. It was an extremely unpopular thing to say ten years ago that you were doing neural networks. It was almost a refuted field to that extent.
But what happened is that the moment we managed to improve the algorithms a few years ago and the moment we were capable of parallelizing neural nets on better hardware, mainly NVIDIA’s GPUs, suddenly we saw that these nature-inspired algorithms worked much better. And even though we’re still running these neural networks on the Von Neumann architecture, the fact that we are, not mimicking, of course, but taking inspiration from biological brains and having kind of vastly parallel small processing units, and they work in some way that is a bit reminiscent of how our brain works, we see the improvement, the huge improvements. In the past few years, we have seen a nearly 20-30% improvement in most benchmarks of computer vision, speech recognition, text understanding, and these are the kind of benchmarks where, using traditional machine learning and AI, we were used to seeing half a percent, 1% improvement a year with the best domain-specific knowledge. If you can imagine, people have spent tens of years perfecting methods for face recognition, and suddenly somebody comes up with a deep neural network that doesn’t rely on any of those methods. It just looks at the raw pixels without any pre-processing, without image processing, and the result is a 20-30% improvement. This is the big revolution of deep learning, and it is certainly the greatest leap in performance in the history of computer science and artificial intelligence that we have witnessed in the past few years.
So this certainly proves that, in contrast to what we believed for many tens of years, the correct way is apparently taking inspiration from biological brains, and even though the hardware architecture between our brains and the Von Neumann machines is different, still taking inspiration and trying to implement them as closely as possible, that’s the way towards creating artificial general intelligence.
But, to be clear, we don’t really know how neurons do their thing, right? One neuron could be as complicated as a supercomputer when we get right down to it. We don’t know how thoughts are encoded, how memories are recalled. I mean, I struggle even understanding how a neural net has anything in common with the brain other than the name, and so make the case that “Oh, no. We understand how the brain works and how the mind works and how it is that we’re intelligent, and we’re taking key concepts from that and instantiating it in a machine.”
It’s a very good point. We certainly do not understand well how the brain works. That’s why it would be wrong to say that we’re trying to mimic the brain, and, at best, it would be accurate to say that we’re trying to take inspiration from our brain. But let’s see what we do know from the brain, or, better put it, we think we do know about our brain. The neurons in our brain, they receive input, electricity input, from other neurons, they aggregate these inputs and then they fire the outputs towards other neurons; and learning is done by optimizing the values of the connectors, the synapses, making stronger connections or weakening them for weaker connections.
In artificial neural networks, we’re doing more or less the same process. The neurons are connected to each other, they receive inputs from others, they aggregate them, they fire them to other neurons, and all the learning and training is done by optimizing the values of these connections between the neurons, the artificial synapses. On top of that, all of us in high school, when we were shown a map of the brain, we were shown “Well, look, this is the visual cortex of the brain in charge of processing visual data; this is the auditory cortex,” etc.
But what we do know, and we’ve already known this for the past 30 years, is that is not very important. The only reason why we have a visual cortex in the certain part of the brain is because the optic nerve, the cable that brings the data from our eyes, ends up there. So, if you switch the places of the optic nerve and the auditory nerve and connect them in reverse order, the visual cortex will become the auditory cortex and vice versa, and there’re many experiments proving this. So, what we know about our brain is that our neurons are general processing units. The neurons in our brain are not born to process visual data or auditory data, etc. There are general processing units and they learn to process what they receive. This is, again, very similar, or reminiscent of what we do in artificial neural networks. In traditional machine learning, we are using features that are crafted for specific problems. If you would like to do face recognition, you must find the features for this task, the distance between pupils, distance between the nose and the mouth, proportions of the face. You do lots of processing for that domain.
Now, if you would like to do face recognition with deep neural networks, you just feed the raw pixels into the deep neural networks without any pre-processing, without telling it anything about the problem, and assuming that by being exposed to enough data, our artificial neurons will adapt themselves to better and better process this kind of information, and we’ve seen the amazing breakthroughs and the huge improvements in recent years. So, while it is certainly wrong to say that our artificial neural networks are mimicking the brain, or even similar to the brain, it is accurate to say that they’re loosely taking inspiration from our brains, and there are some parallels between them and our brains, and the results show that, well, it is working.
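The aggregate-and-fire description in this answer (neurons receive inputs, aggregate them, fire an output, and learn by adjusting connection strengths) can be sketched with a single artificial neuron. This is a minimal illustration under stated assumptions, not anything from the interview: the sigmoid activation, the logical-OR training data, the learning rate, and the names `neuron_output` and `train` are all hypothetical choices.

```python
import math


def neuron_output(weights, bias, inputs):
    # Aggregate the weighted inputs, then "fire" through a sigmoid activation.
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))


def train(samples, epochs=2000, lr=0.5):
    # The weights play the role of artificial synapses; all learning
    # happens by strengthening or weakening them.
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - neuron_output(weights, bias, inputs)
            # Gradient step for a sigmoid unit with cross-entropy loss.
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


# Assumed toy task: learn logical OR from examples only.
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(OR_SAMPLES)
for inputs, target in OR_SAMPLES:
    print(inputs, target, round(neuron_output(weights, bias, inputs), 3))
```

Nothing in the code is specific to OR; the same unit would fit any linearly separable task it is shown, which loosely mirrors the point about general processing units. A deep network stacks many such units in layers.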
Nobody can dispute that. And, so, last question about intelligence and the brain, and then let’s move onto security, what do you think of the human brain project, which is, of course, a multi-billion-dollar project to do more than take inspiration from, but to essentially build a computer brain modelled after a human?
Well, first of all, I think I am still in the minority of the researchers, but I firmly do believe that within our lifetimes, we will see computers that are as smart as ourselves, or, of course, much smarter. Our brain is a bunch of neurons. Our brain is a kind of computer. For everything that we know today, our brain is just a computer. It receives input, it processes it and there is output. So, there is no theoretical reason why we cannot create something equal, or, of course, much, much better than that. If we compare, for example, our brain to that of chimpanzees, our brains are extremely similar, the same six layers in the neocortex, the same kind of connectivity, the neurons look the same. The main difference is that if you crack open our neocortex and spread it on the table, it will be the size of a table napkin; if you crack the chimpanzee’s brain and spread the neocortex on the table, it will be the size of a business envelope. So, mainly, it’s a matter of size, it’s a matter of quantity, not quality.
And, by the way, we see the same in the artificial neural networks as well. Despite all the great improvements in deep learning, most of the algorithms today that we use are the same as those we used in the late 1980s and 1990s and early 2000s. So, why are today’s deep learning methods working so well, whereas in 1990s and early 2000s, it didn’t work well at all? It’s mainly the size, due to the hardware improvements and some algorithmic improvements we’re managing to train much bigger neural networks. So, here, again, we see it’s a matter of quantity much more than quality. So, for this reason, I certainly believe that we can create a brain or even a better version of that.
Now, regarding the human brain project and trying to understand the brain, I’ve been talking about it and am very interested, and some of my students did research on using deep learning for actual brain research. My feeling is that the progress in actual human brain research is too slow. We still don’t understand it very well and we have a problem measuring it. For example, even the most advanced extensive kind of measurement can measure only small areas of the brain that contain millions of synapses. So we are far, far away from being in a state in which we can measure actual synaptic activity in the brain.
So, while I really do hope that we can fully understand the brain – and, in that case, we will just copy it as an artificial brain, problem solved. Of course, exaggerating a bit. I think we are much further away from fully understanding the brain than from creating an artificial version of it. So, if I had to make a wild guess, we will manage to create an artificial brain that is as smart as us way before we manage to fully understand our own brains. Of course, this is a very wild prediction. If my prediction in 30 or 40 years from now turns out correct, I will certainly keep a recording of this interview to show it to everyone, and if my prediction is not correct, I hope nobody else keeps a recording of this prediction.
So, I know I said that was my last one, I have to ask though, you essentially seem to be saying that the reason we are having successes is essentially we now have bigger neocortexes in our computers, I mean, effectively. And by saying ours is the size of an unfolded table napkin and a chimp’s is small, that seems to me that you’re saying you believe general intelligence is a direct correlate to the size of a neocortex, as opposed to saying “I think there’s some kind of emergence, strong emergence, weak emergence, that happens, and intelligence kind of rises out from that.” So, to make that a question, is general intelligence simply a linear ability that will come from effectively a larger neocortex in the computer, or is general intelligence an emergent property that we don’t really understand how it comes about in us, let alone how it would come out in a machine?
Giving any confident answer to anything that has to do with brain and intelligence would, of course, be foolish. Many of the things that we confidently—
I don’t think I asked that very well. Let me try it a different way. Do you believe that there is a master algorithm? So you know the whole story about when AI first—when the term was created, they said “Hey, we can probably solve this in a summer if we worked really hard,” and core to that belief was a belief that intelligence would turn out to be something like the laws of physics, you know, like Newton comes up with three laws and they kind of explain everything, and the trick is coming up with those laws—electricity a few laws, magnetism a few laws. Do you think intelligence is like that? Is it that there are just a few tricks that neurons do, and once we can replicate something like that, we will have a general intelligence?
I do think so. I do think so. By making wild extrapolation from what we learned in the past thirty years of research, even in the artificial neural nets, everyone within artificial intelligence and traditional machine learning was confident, until a few years ago, that the kind of algorithms we use for training neural networks are extremely inefficient. And they may be. I do think they’re inefficient.
But the same seemingly inefficient algorithms that at best obtained mediocre results until a few years ago, when we suddenly make the quantity bigger, they give a complete knockout to every other algorithm there is within the traditional artificial intelligence and machine learning. So, are our neural networks inefficient today? I certainly do believe. We’re doing many things that don’t make sense, but we do them because we don’t have any better alternative. In order to create artificial general intelligence, would we need a series or many series of major breakthroughs in the field of neural networks and AI, I certainly do believe that. But I do think in parallel to these qualitative improvements, quantity in itself plays a very important role.
Now, today, many people compare the best neural networks we have with humans, to reach the correct conclusion that, well, they’re far, far less intelligent than us. It does make sense, but let me give you another comparison. In our cerebral cortex, the main cognitive part of our brain, we have about 16 billion neurons, each one of them connected to an average of ten thousand neurons. So we multiply 16 billion by 10,000, we have roughly 160 trillion synapses, connectors, in our brain. Now, in artificial neural networks, the largest neural networks in deep learning that we can train today have just a few billions of synapses. So the best hardware today is still more than 100,000 times inferior to the hardware we have in our brain. On the other hand, to quote Geoffrey Hinton, the father of deep learning, he recently mentioned that in the 30 years of research, he managed to make neural networks a million times better, ten times algorithmically, and meanwhile the hardware became 100,000 times faster.
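The arithmetic in this comparison can be checked with a short sketch. The neuron, synapse, and speed-up figures are the ones quoted above; the value used for "a few billion" artificial synapses (1.6 billion) is an assumption picked only because it reproduces the quoted 100,000-fold gap.

```python
# Figures quoted in the interview.
cortical_neurons = 16_000_000_000       # about 16 billion
synapses_per_neuron = 10_000            # average connections per neuron
brain_synapses = cortical_neurons * synapses_per_neuron

algorithmic_gain = 10                   # "ten times algorithmically"
hardware_gain = 100_000                 # "100,000 times faster"
combined_gain = algorithmic_gain * hardware_gain


def gap(artificial_synapses):
    # How many times more synapses the cortex has than a given network.
    return brain_synapses / artificial_synapses


# Assumption: "a few billion" taken as 1.6 billion for illustration.
assumed_artificial_synapses = 1_600_000_000

print(f"{brain_synapses:,} synapses")   # 160 trillion
print(f"{combined_gain:,}x combined improvement")
print(f"{gap(assumed_artificial_synapses):,.0f}x gap")
```

With a larger assumed network, for example 5 billion synapses, the gap would come out closer to 32,000-fold, so the 100,000 figure is best read as an order-of-magnitude estimate.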
So instead of training neural nets with just a few thousand synapses, now we train with billions of synapses. What would happen in 20 or 30 or 50 years from now when our largest neural networks, instead of having billions of synapses, would have hundreds of trillions of synapses? I do expect mind-boggling improvements that we cannot even imagine. And I do think that what we call consciousness and self-reflection, that will also be a side effect of very large neural networks, because maybe there are other things… But, today, when you look into a real human brain, you cannot find anything else except neurons, synapses and electricity, and if you compare the brains of the most advanced mammals as far as intelligence is concerned – we think this is us humans – to the least advanced mammals with the smallest brains, the biggest difference is the number of neurons and synapses. And, by the way, a book that I highly recommend in that regard is the book The Human Advantage, written by one of the leading neuroscientists, a Brazilian neuroscientist. She wrote a very eloquent book, and she describes in the book her journey in neuroscience, in which she wanted to find the answer to “what makes humans special.” I highly recommend this, but just to give a spoiler, the conclusion is, it’s mainly quantitative. That’s the difference between humans and all the others. We simply have more neurons in our brain. So, back to your question, I think I would certainly be in the “yes” camp.
Well, excellent. I mean, I would love to take that thread and run with it, ask you what you think about sperm whales who have, you know, brains much bigger than ours, ask if we really are the smartest on the planet, ask about the internet and the size of the internet as a whole—could it already have intelligence, we could go down all of those paths, but, time, of course, doesn’t permit.
But what you know so well is about security, so start off by telling us, before we dive into Deep Instinct, tell us just what’s going on in the world right now with regard to security; who’re the bad actors and what’re they doing and how do we defend against them and just paint the picture of what the world is today. Scare us a little bit if you would like to.
Certainly. Let me start with an image analogy. Imagine I show you a picture of a dog, and you recognize it easily; everyone recognizes it easily. Now, imagine I modify just two percent of the pixels in the image, or one percent of the pixels, I modify them slightly, and then you don’t recognize it at all. Nobody recognizes it at all. It sounds preposterous, right? But this is the state of malware development in cybersecurity. We have more than one million new malicious files developed every single day. These are new files every single day, more than a million. Now, we have many solutions. For example, I guess all of us have an antivirus on our devices. These solutions are developed mainly for protecting us against currently-existing attacks. They are not designed for, and they’re incapable of, protecting against most of the new attacks, and if you’re the attacker and you take a well-known malware that everyone detects, you just need to modify it slightly. It takes somewhere between a few minutes and a few hours at most, usually a few minutes, and you have a new mutation that nobody detects. You can infect everyone. And these are not advanced malware.
You have much more advanced malware. You have the nation-state malware, a part of the international cyber wars. The nation state developed them. So if we look at the trajectory in the past few years, we see a huge exponential increase in both quantity of new malware and the quality and sophistication of them. So it’s kind of a cat-and-mouse game in which the attackers have the upper hand. So I wish I didn’t scare you too much, but that is the state of affairs.
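The point about a slightly modified file slipping past detection can be illustrated with a deliberately simplified sketch. It assumes a detector that only matches exact SHA-256 hashes of known files; real antivirus products use more than this, and the interview does not describe any product's internals. The sample bytes and the helper names `signature`, `flip_one_bit`, and `is_flagged` are hypothetical.

```python
import hashlib


def signature(data: bytes) -> str:
    # Exact-match "signature" (assumption): the SHA-256 digest of the file.
    return hashlib.sha256(data).hexdigest()


def flip_one_bit(data: bytes, index: int = 0) -> bytes:
    # Produce a "mutation" that differs from the original by a single bit.
    mutated = bytearray(data)
    mutated[index] ^= 0x01
    return bytes(mutated)


def is_flagged(data: bytes, blocklist: set) -> bool:
    return signature(data) in blocklist


# Harmless placeholder bytes standing in for a known malicious file.
known_sample = b"placeholder bytes standing in for a known malicious file"
blocklist = {signature(known_sample)}

variant = flip_one_bit(known_sample, index=5)

print(is_flagged(known_sample, blocklist))  # the known file matches
print(is_flagged(variant, blocklist))       # the one-bit variant does not
```

An exact-match list has to be updated for every new variant, which is one way to picture why defenses built around already-known files struggle against a million new files a day.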
Well, let’s talk about the very last part of that, which is, you know, there’s been this struggle between code breakers and code makers for 2,000 years – and there’s not even a settled question is it easier to make an unbreakable code or is it easier to break the unbreakable code – if you’re right and if right now the bad actors have an advantage, why isn’t the world overrun with viruses? Why doesn’t every single thing get stolen? Every time I log on, why aren’t 50 beings taking all of my credit card information? Why aren’t 100 computers probing my network? Or are they? So why is there anything right now… If it’s so easy, if there’s a million new ones a day and they slip right past all the software, then effectively we don’t have any security anywhere on the internet, but that doesn’t feel like that’s the case.
It would not be a wild exaggeration to say that for many organizations and consumers, individuals, there is nearly no security if we are targeted, but, still, the number of users, the number of computers on the internet is so large that even these vast amounts of malware still do not affect everyone. However, a few years ago, you would really worry about the next-generation malware, the advanced malware, only if you were an organization harboring sensitive data, if you were a bank, for example. Today, every organization knows that it can be a victim of this kind of attack, whether you are a big Fortune 500 company, a small company, a medium business, an individual. For example, during the past year, we saw some of the worst new ransomware that resulted in hundreds of millions of dollars of damage to companies like Maersk, the largest shipping company in the world, to Merck, one of the largest pharmaceutical companies in the world, to FedEx, etc., and the worst part is that this ransomware that caused huge amounts of damage was not targeted ransomware. Nobody targeted Maersk. Nobody targeted FedEx. It was generic malware that just infected companies that it stumbled upon, and this was the kind of damage it caused. Now imagine the amount of damage that could be caused if your company is specifically targeted. That would be much, much worse.
But, again, to ask the question, why can we do anything online? Banks have a lot of money. Why doesn’t every bank get hacked into 100 times a day or 1,000 times a day and everybody’s account just drained? Why can we do anything online if the situation is that bad? Is it because there’s few of these bad actors or what?
Actually, banks are the most frequent targets. Every malware developer knows that “Well, banks are where the money is, so I should try to attack them.” But on the other hand, especially in North America and Europe, banks are some of the best protected companies. If you’re a bank, you certainly do have an antivirus, but in addition to antivirus, you have a long list of other solutions to protect your network, your communications, your anomaly analysis, and especially you do have a solution that’s called a next-generation malware detector. While traditional antiviruses are developed for detecting currently-existing malware, during the past few years, there’s an entire new domain of solutions that are designed to detect and prevent these new malware, these new mutations, new nation-state attacks. And not only banks – nearly every bank certainly has this kind of solution – but today most Fortune 500 companies have at least one of these next-generation solutions that are designed for detecting new malware, and that’s the reason why you don’t see very frequent damage at these big corporations.
So, these bad actors around the world, what are most of them? Are they individuals? Are they criminal organizations? Are they nation states? Are they lone people who do this on their own? Who do you think is behind it all?
You have every kind of people from across the spectrum. Certainly you have many individuals. Developing a new ransomware is extremely easy. You can just take some open-source code that encrypts the files, add some bits and pieces from here and there, and congratulations, you have your own ransomware, you can infect people’s computers and ask for Bitcoins and get your money. That’s the good incentive, actually the bad incentive, but an obvious incentive for many individuals. Further than that, you have criminal organizations. A criminal organization can be something as small as three people and as big as tens of people, who, instead of just attacking individuals here and there, target companies, target corporations, and the kind of ransom they can receive is, of course, much higher. And at the extreme end of the sophistication, you have nation-state players, some of the most technologically advanced countries in the world that have large teams of malware developers that can range from hundreds to thousands of people. They develop professional solutions that are designed for targeting, compromising networks, obtaining information, doing damage, etc. So you have this entire spectrum from a single teenager attacking people for some Bitcoins up to nation states.
Do you think that the infrastructure in the United States—it’s been described as highly vulnerable. Would it be possible or is it likely that other countries could take out the power grid, then take out water purification, and then is GPS vulnerable? Are all of our systems just essentially… that any number of people around the world could bring them down, or is it more complicated than that?
First of all, my impression is that the United States infrastructure is one of the most secure in the world. Of course, I’m not familiar with the critical infrastructure, but just judging from how US corporations behave, I can tell you that two years ago, when we were talking to European companies about next-generation solutions, about our deep-learning-based solutions for detecting new malware, they were reluctant. They said “well, on our endpoints, we have antivirus. We’re not sure we need this kind of solution.” That was two years ago! At the same time, two years ago, nearly every Fortune 500 company in the US was already actively pursuing and testing these kinds of solutions. Just today, this year, we see that Europe is also interested, and we see a ramp up of our sales in Europe as well. So, for all the technologies of cyber protection and defense, we usually see the United States being at least a year or two ahead of other countries in the western world, and more protected and trying to be more protected.
Having said that, there is no such thing as 100% protected. The more relevant solutions you have, or each kind of solution – for example, you have endpoint protection, you have network protection, etc, etc – for each category, you do professional testing and you bring the best solution you have there, and you have your array of many different solutions, each one of them best of breed, you just increase the probability that you would withstand an attack. In other words, what it means is average people cannot attack you. You need more sophisticated attackers and more motivation, spending much more time and energy and effort trying to penetrate. So, while I don’t think there is any infrastructure or any country in the world that is fully protected against attack, I do believe that countries such as the United States that are very cyber-conscious and spend lots of resources for defense, are among the most well protected probabilistically in comparison to others.
So, you know, Stuxnet from, say, back in 2005, which was a malicious worm which could turn on and spin up centrifuges and cause them to overheat or just destroy them, are all these devices we plug into the internet every day – you know, a million new, whatever, internet-enabled toaster ovens – is all of that vulnerable to that kind of an attack? Because those systems aren’t generally upgradable, right? Like, if you find a vulnerability in one, then there’s no way to fix it; like every one of those toaster ovens is just a bomb waiting to go off. Is that true?
It is. It certainly is. Stuxnet was a good example. Even way, way before Stuxnet, we knew that cyber-attacks can not only be used to steal information, data, etc., they can result in real-world physical damage. Stuxnet was a prominent example of that, and it increased the awareness in the world that cyber-attacks could actually result in damage. And, today, if you look at the IoT devices, nearly everything: our TVs are smart TVs, our refrigerator is a smart refrigerator. We are running out of dumb things. Everything is becoming smart, which means it is connected to the internet, it is running a computer within it, and it is vulnerable to attack, and the IoTs are the least protected of all devices today. In most of the IoTs that are developed today, protection and cyber defense is still, at best, an afterthought, and as you accurately mentioned, when a vulnerability is detected, it takes an unacceptably long time until it is patched, if at all. On many occasions, it is simply not patched at all. So IoTs are currently some of the most vulnerable areas that we have.
So, setting all of that up, tell me a bit about Deep Instinct. The website makes the claim that it is the first company to apply deep learning to cybersecurity. Is that true? And what does Deep Instinct do and how do you do it?
Sure. We founded the company three years ago based on the belief that wherever deep learning is applied, the results are very big improvements over the previous state of the art. My background is in academic research on deep learning for vision, speech, text, games. Everywhere I applied deep neural nets, I saw very big improvements.
So our hypothesis was that if we managed to apply deep learning to cybersecurity as well, we would substantially increase the detection rate of new malware. To do that, we modified the standard neural network algorithms to be able to process computer files. We just process the raw byte values of the files. So if we have an executable file or PDF file or Office file, we’re not trying to analyze them and extract features or do some generic analysis; we’re just feeding raw byte values directly into the neural networks. In other words, we’re treating computer files as if they were images. Instead of pixels, they have a bunch of bytes. Once we had done that, we created a dedicated deep-learning model and trained it in our laboratory on many hundreds of millions of malicious and legitimate files. The results we obtain, which our customers also see in the tests they conduct on their own independent data, show a very substantial improvement in detection rates and substantially lower false-positive rates. So what we have basically established in the testing is that cybersecurity is essentially not much different from computer vision, speech, text, etc. Similar to those areas, in cybersecurity also, when deep learning is successfully employed, the results show a very big improvement in accuracy.
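To make the "bytes instead of pixels" idea concrete, here is a minimal sketch. It only shows the input side: each raw byte is scaled to the 0..1 range, the way pixel intensities are commonly scaled before being fed to a network. The function name `bytes_to_input` and the sample content are illustrative assumptions, and nothing here reflects Deep Instinct's actual preprocessing, which the interview does not describe.

```python
from typing import List


def bytes_to_input(raw: bytes) -> List[float]:
    """Scale each byte (0-255) to the range 0.0-1.0, like a grayscale pixel."""
    return [b / 255.0 for b in raw]


# Hypothetical sample: eight bytes that begin with the two-byte "MZ"
# signature found at the start of Windows executables.
sample = b"MZ\x90\x00\x03\x00\x00\x00"
vector = bytes_to_input(sample)

# No parsing and no hand-written features: one number per byte.
print(len(vector), vector)
```

The point of the sketch is what is absent: there is no file-format parser and no feature extraction step between the file and the model input.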
And so how does your solution manifest itself? I mean, presumably it’s not something a listener would go and download. Like, what is it, actually?
First of all, applying deep learning in cybersecurity is much more challenging than vision, speech, text, etc. Just to give you an example, in computer vision, when you process images, you assume that the pixels are locally correlated. If a certain pixel is green, most probably adjacent pixels would also be green. If you open a computer file, an executable file, as if it were an image, you will see seemingly random black and white pixels. It looks like it’s random data. It’s not random. The correlations are simply not local. You have non-local correlations.
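One way to put a number on "locally correlated" is to measure how well each value predicts its immediate neighbour. The sketch below does that for a smooth synthetic image row and for pseudo-random bytes. Both inputs are assumptions made for illustration: the random bytes are only a stand-in for data that looks like noise up close, not a real executable, and the sketch says nothing about the non-local structure a real file would have.

```python
import random
from typing import Sequence


def adjacent_correlation(values: Sequence[int]) -> float:
    """Pearson correlation between each value and its right-hand neighbour."""
    xs, ys = values[:-1], values[1:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# A smooth gradient: neighbouring "pixels" have nearly the same brightness.
smooth_row = bytes(i // 4 for i in range(1024))

# Seeded pseudo-random bytes: a neighbour tells you almost nothing.
rng = random.Random(0)
noisy_bytes = bytes(rng.randrange(256) for _ in range(1024))

smooth_corr = adjacent_correlation(smooth_row)
noisy_corr = adjacent_correlation(noisy_bytes)
print(f"smooth row:   {smooth_corr:.3f}")
print(f"random bytes: {noisy_corr:.3f}")
```

The smooth row scores close to 1 and the random bytes score close to 0, which is the property that image-oriented network designs rely on and that raw file bytes do not offer.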
Also, what if you have images of different sizes: one image 100 x 100 pixels, another 200 x 200 pixels? Then you just resize them to a fixed size and feed them into your neural net. In cybersecurity, you can have a file which is 100 kilobytes, 100 megabytes or 100 gigabytes. You certainly cannot resize them. So these were just examples of the kinds of challenges that we had to overcome to be able to apply deep learning in cybersecurity. The way we do it is we have two phases: the training and the deployment. During the training phase, we run our deep-learning framework in our laboratory, on our GPU machines. We have a training dataset comprising hundreds of millions of legitimate and malicious files. And during training, we feed these hundreds of millions of files into the brain, the deep neural net, and it tries to classify them: this is malicious, this is legitimate. So it’s a fully supervised training. It’s a bit similar to training the brain to detect if an image contains a cat or not. This is a cat. This is not a cat. Instead of cats and not cats, we have malicious and not malicious.
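The supervised loop described here (show a labeled file, let the model guess, correct it by the error) can be sketched at toy scale. Everything specific below is an assumption made for illustration: a 16-bucket byte histogram stands in for a fixed-size input that works for files of any length, a single-layer logistic model stands in for the deep neural net, and the "files" are synthetic byte strings. None of it is Deep Instinct's method.

```python
import math
import random
from typing import List, Tuple

BINS = 16  # assumption: a coarse histogram keeps the toy model tiny


def features(raw: bytes) -> List[float]:
    """Fixed-length summary of a file of any size: share of bytes per bucket."""
    counts = [0] * BINS
    for b in raw:
        counts[b >> 4] += 1
    total = max(len(raw), 1)
    return [c / total for c in counts]


def score(weights: List[float], x: List[float]) -> float:
    """Logistic output in 0..1, read as 'probability of malicious'."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: List[float], raw: bytes) -> float:
    return score(weights, features(raw))


def train(samples: List[Tuple[bytes, int]], epochs: int = 50,
          lr: float = 1.0) -> List[float]:
    """Supervised training: nudge the weights by the error on each labeled file."""
    data = [(features(raw), label) for raw, label in samples]
    weights = [0.0] * BINS
    for _ in range(epochs):
        for x, label in data:  # label: 1 = malicious, 0 = legitimate
            error = score(weights, x) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights


rng = random.Random(0)


def fake_file(label: int) -> bytes:
    """Synthetic stand-in: 'malicious' uses high byte values, 'legitimate' low."""
    size = rng.randrange(200, 5000)  # sizes vary, the feature length does not
    low = 128 if label else 0
    return bytes(rng.randrange(low, low + 128) for _ in range(size))


training_set = [(fake_file(label), label) for label in (0, 1) for _ in range(20)]
weights = train(training_set)
print(predict(weights, fake_file(1)), predict(weights, fake_file(0)))
```

The synthetic data is trivially separable, so the sketch demonstrates only the shape of the loop, not anything about real detection accuracy.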
This training phase takes a bit over 24 hours. When the training is finished, we have a pre-trained neural net that can see a new file and provide a prediction for whether it thinks it’s malicious or legitimate. We then take this pre-trained neural net and we put a copy of it on each and every device that we protect. So, imagine you’re a large organization, you have 50,000 laptops, desktops and mobile devices, and on each one of these devices, these endpoints, we put a copy of our brain, the pre-trained neural net. So, for any file that touches your file system, the file goes through the neural net, it provides a prediction for whether it thinks it’s legitimate or malicious, and if it thinks it is malicious, we immediately remove the file on the device. And the entire cycle, from the moment the file is observed to the moment it’s removed, takes a few milliseconds, before the file even starts executing. So it provides pre-execution prevention by putting the neural net on the device. And the moment we put the neural net on the device, it no longer trains. It’s just providing predictions. We have periodic training in our laboratory, and when we have a new trained version, a new trained neural net, we update all the agents of all of our customers with this updated neural net.
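The deployment side can be sketched as a small scan routine: a frozen model scores a file and the file is removed if the score crosses a threshold. The names `scan_file` and `toy_model`, the 0.5 threshold, and the marker-string "model" are all hypothetical placeholders. A real agent would hook file-system events and run a trained network, neither of which is shown here.

```python
import tempfile
from pathlib import Path
from typing import Callable

# A frozen model: takes file content, returns a probability of "malicious".
Model = Callable[[bytes], float]


def scan_file(path: Path, model: Model, threshold: float = 0.5) -> bool:
    """Score a file with the frozen model and delete it if flagged.

    Returns True if the file was removed. The model is only queried,
    never updated, which mirrors "it no longer trains" on the device.
    """
    verdict = model(path.read_bytes())
    if verdict >= threshold:
        path.unlink()
        return True
    return False


def toy_model(raw: bytes) -> float:
    # Placeholder for the pre-trained net: flags content with a marker string.
    return 1.0 if b"EVIL" in raw else 0.0


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "report.txt"
    bad = Path(tmp) / "dropper.bin"
    good.write_bytes(b"quarterly numbers")
    bad.write_bytes(b"\x00EVIL\x00")

    results = {p.name: scan_file(p, toy_model) for p in (good, bad)}
    survivors = sorted(p.name for p in Path(tmp).iterdir())

print(results, survivors)
```

Updating the protection then amounts to swapping in a newer `model` callable, which corresponds to pushing a newly trained net to the agents.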
And, so, if you were a black-hat player, wouldn’t you use the same techniques to create your next malware as well? I mean, like, isn’t all of this going to be duplicated on the other side and the struggle goes on ad infinitum?
It’s a very good point. Theoretically, it certainly makes sense. You can use deep learning to mutate the malware in a much more intelligent manner to evade detection. And we already have similar methods, not just in cybersecurity, but, for example, in computer vision we have GANs, generative adversarial networks, and they allow you to automatically, for example, modify an image to fool a detector into thinking it’s a cat while it’s not a cat, or take an image of a cat and modify a few pixels and fool the detector into thinking it’s not a cat.
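The "modify a few pixels and fool the detector" effect can be shown on a toy linear detector. To be plain about the substitution: this is not a GAN. It is a simple sign-of-the-weight perturbation, chosen because it fits in a few lines, and the weights, pixel values, and 0.5 threshold are made-up numbers for illustration.

```python
from typing import List

THRESHOLD = 0.5  # assumption: the toy detector says "cat" above this score


def detector_score(weights: List[float], pixels: List[float]) -> float:
    """Toy linear detector: weighted sum of pixel intensities."""
    return sum(w * p for w, p in zip(weights, pixels))


def perturb(weights: List[float], pixels: List[float],
            budget: int, step: float) -> List[float]:
    """Change only the `budget` most influential pixels, each by `step`,
    in the direction that lowers the detector's score."""
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]),
                    reverse=True)
    out = list(pixels)
    for i in ranked[:budget]:
        direction = -1.0 if weights[i] > 0 else 1.0
        out[i] = min(1.0, max(0.0, out[i] + direction * step))
    return out


weights = [0.9, -0.1, 0.05, 0.8, -0.05, 0.1, 0.02, -0.02]
cat = [0.6, 0.5, 0.5, 0.6, 0.5, 0.5, 0.5, 0.5]

adversarial = perturb(weights, cat, budget=2, step=0.4)
changed = sum(1 for a, b in zip(cat, adversarial) if a != b)

print(detector_score(weights, cat) > THRESHOLD)          # detected as cat
print(detector_score(weights, adversarial) > THRESHOLD)  # no longer detected
print(changed, "of", len(cat), "pixels changed")
```

Only two of eight pixels move, yet the score drops below the threshold, and the result is still a perfectly valid list of pixel intensities.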
There is ongoing research in trying to apply these kinds of methods in cybersecurity as well, take a malicious file and modify it slightly and make all the other detectors think it is not a malicious file. But applying these methods in cybersecurity is much more difficult. For example, if you take an image and you randomly modify a few pixels, the resulting image is still an image. It’s a perfectly valid image. Now, if you take an executable file and you randomly modify a few bytes, most probably you’ve broken the file. It will no longer work.
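The contrast between "still an image" and "broken file" can be sketched by flipping one byte at a time in two kinds of data. The assumption to flag: a zlib-compressed blob is used here only as a convenient stand-in for a format with strict internal structure. It is not an executable, and executables fail in different ways, but it shows how a tiny change can invalidate the whole container while a raw pixel grid accepts any change.

```python
import zlib


def flip_byte(data: bytes, index: int) -> bytes:
    """Return a copy of `data` with every bit of one byte inverted."""
    mutated = bytearray(data)
    mutated[index] ^= 0xFF
    return bytes(mutated)


def image_still_valid(pixels: bytes, size: int = 64) -> bool:
    """A raw 8x8 grayscale grid is valid as long as it has 64 values."""
    return len(pixels) == size


def container_still_valid(blob: bytes) -> bool:
    """The stand-in 'structured file' is valid only if it still decompresses."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        return False


pixels = bytes(range(64))                    # toy 8x8 image
container = zlib.compress(b"payload " * 32)  # stand-in structured file

broken_images = sum(
    not image_still_valid(flip_byte(pixels, i)) for i in range(len(pixels))
)
broken_containers = sum(
    not container_still_valid(flip_byte(container, i))
    for i in range(len(container))
)

print(f"images broken:     {broken_images} of {len(pixels)}")
print(f"containers broken: {broken_containers} of {len(container)}")
```

No single-byte change can break the pixel grid, whereas changes to the structured blob do, which is why an attacker mutating a file has to preserve function as well as evade detection.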
So if we are interested in developing a new malware using deep learning, we’re not only interested in fooling the detector. We’re also interested in making our malicious file work and do what needs to be done for us, and that is much more complex here. Having said that, I certainly do expect that in the next few years, more advanced methods will be found and malware developers will also use deep learning to improve their attacks. As with every technology in the history of computer science, every new advancement, whether it serves at first the attackers or the defenders, after some time, the other camp is also using it to improve their tools.
All right, well, it has been a fascinating hour. Our time’s almost up here. If people want to follow you, how do they do that? Like, do you have a website or do you blog? And, likewise, tell us how we can find out more about Deep Instinct.
Certainly. First of all, I have my own personal website in which people can get updated. It’s mainly about my academic research: www.elidavid.com. And from there, there are links to my Twitter and LinkedIn accounts as well. And, for Deep Instinct, you can find a wealth of information on Deep Instinct’s official website: www.DeepInstinct.com. And, by the way, we just released a new book, Deep Learning for Dummies, that is also available for download, and you can download it at the Deep Instinct website.
All right, well, I want to thank you so much for taking time today. It was a fascinating and kind of frightening hour. So thanks so much.
Thank you very much. It was a pleasure.
Byron explores issues around artificial intelligence and conscious computers in his new book The Fourth Age: Smart Robots, Conscious Computers, and the Future of Humanity.