Episode 104 – A Conversation with Anirudh Koul

Byron Reese discusses the nature of intelligence and how artificial intelligence evolves and becomes viable in today's world.

Guest

Anirudh Koul is a data scientist at Microsoft. He brings eight years of applied research experience on petabyte-scale social media datasets including Facebook, Twitter, Yahoo Answers, Quora, Foursquare, and Bing. He has worked on a variety of machine learning, natural language processing, and information retrieval-related projects at Yahoo, Microsoft, and Carnegie Mellon University. Rapidly prototyping ideas, he has won over two dozen innovation, programming, and 24-hour hackathon contests organized by companies including Facebook, Google, Microsoft, IBM, and Yahoo. Koul was also the keynote speaker at the SMX conference in Munich (March 2014), where he spoke about trends in applying machine learning on big data.

Transcript

Byron Reese: This is Voices in AI, brought to you by GigaOm. I'm Byron Reese. Today, my guest is Anirudh Koul. He is the head of Artificial Intelligence and Research at Aira and the founder of Seeing AI. Before that he was a data scientist at Microsoft for six years. He has a Masters of Computational Data Science from Carnegie Mellon, and some of his work was just called by Time magazine ‘One of the best inventions of 2018,’ which I'm sure we will come to in a minute. Welcome to the show, Anirudh.

Anirudh Koul: It's a pleasure being here. Hi to everyone.

So I always like to start off with, I don't want to call it a philosophical question, but it's sort of a definitional question, which is: what is artificial intelligence and, more specifically, what is intelligence?

Technology has always been here to fill the gap between our abilities and our tasks, and we are noticing this transformational technology, artificial intelligence, which can now try to mimic and predict based on previous observations, and hopefully try to mimic human intelligence, which is the long-term goal and which might probably take 100 years to happen. Just noticing the evolution of it over the last few decades, where we are and where the future is going to be based on how much we have achieved so far, it is just exciting to be in it and be playing a part in it.

It’s interesting you use the word ‘mimic’ human intelligence as opposed to achieve human intelligence. So do you think artificial intelligence isn't really intelligence? All it can do is kind of look like intelligence, but it is not really intelligence?

From the outside, when you see something happen for the first time, it's like magic. When you see the demo of an image being described by a computer in an English sentence, if you saw one of those demos in 2015, it just knocks your socks off when you see it the first time. But then, if you ask a researcher, they'd say, “Well, it has kind of, you know, learned the data, the pattern behind the scenes, and it does make mistakes. It's like a three year old. It knows a little bit, but the more of the world you show it, the smarter it gets.” So from the outside, from the point of view of the press, the reason why there’s a lot of hype is because of the magical effect when you see it happen for the first time. But the more you play with it, you also start to learn how far it has to go. So right now, mimicking might probably be a better word to use for it, and hopefully in the future, maybe it will go closer to real intelligence. Maybe in a few centuries.

I notice the closer people are to actually coding, the further off they think general intelligence is. Have you observed that?

Yeah. If you look at the industrial trend and especially talking to people who are actively working on it, if you try to ask them when is artificial general intelligence (the field that you're just talking about) going to come, most people on average will give you the year 20… They'll basically give the end of this century. That's when they think that artificial general intelligence will be achieved. And the reason is because of how far we have to go to achieve it.

At the same time, as 2017/18 came along, you start to learn that AI is really often an optimization problem trying to achieve a goal, and that many times these goals can be misaligned: the system tries to achieve the goal no matter how. Some of the fun examples are famous failure cases, like a robot that was trying to minimize the time a pancake spent on the surface of the pancake maker. It would basically flip the pancake up in the air, but because the optimization problem was to minimize that time, it would flip the pancake so high in the air that it would basically go to space during simulation, and that minimized the time.

A lot of those failure cases are now being studied to understand the best practices and also to learn that, “Hey, we need to be keeping a realistic view of how to achieve that.” They’re fun to look at from both sides of what you can achieve realistically: studying some of those failure cases, and keeping an appreciation for [the fact that] we have a long way to go to achieve that.
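The pancake failure described above can be reproduced in a toy form. The following is a minimal sketch, not the original robot experiment: the physics (a straight vertical toss with no air drag), the ten-second episode, the candidate launch speeds, and the height limit are all assumptions made for illustration. It shows that an objective which only penalizes time on the pan is best served by the most extreme toss available, and that adding a constraint changes the answer.

```python
G = 9.81  # gravitational acceleration in m/s^2


def airtime(launch_speed):
    """Seconds a pancake stays airborne when tossed straight up (no air drag)."""
    return 2.0 * launch_speed / G


def naive_reward(launch_speed, episode_seconds=10.0):
    """Misspecified objective: only penalize time spent on the pan."""
    time_on_pan = max(0.0, episode_seconds - airtime(launch_speed))
    return -time_on_pan


def constrained_reward(launch_speed, max_height=0.5):
    """Same objective, but tosses higher than max_height metres are ruled out."""
    peak_height = launch_speed ** 2 / (2.0 * G)
    if peak_height > max_height:
        return float("-inf")
    return naive_reward(launch_speed)


def best_speed(reward_fn, candidate_speeds):
    """Brute-force 'optimizer': pick the candidate with the highest reward."""
    return max(candidate_speeds, key=reward_fn)


# Hypothetical launch speeds in m/s that the optimizer may choose from.
candidate_speeds = [1.0, 3.0, 10.0, 100.0]
naive_choice = best_speed(naive_reward, candidate_speeds)
constrained_choice = best_speed(constrained_reward, candidate_speeds)
print(naive_choice, constrained_choice)
```

With these assumed numbers, the naive objective selects the 100 m/s toss, while the constrained one settles on 3 m/s, the fastest toss that stays under the height limit.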

Who do you think is actually working on general intelligence? Because 99% of all the money put in AI is, like you said, to solve problems like get that pancake cooked as fast as you can. When I start to think about who's working on general intelligence, it's an incredibly short list. You could say OpenAI, the Human Brain Project in Europe, maybe your alma mater Carnegie Mellon. Who is working on it? Or will we just get it eventually by getting so good at narrow AI, or is narrow AI just really a whole different thing?

So when you try to achieve any task, you break it down into subtasks that you can achieve well, right? So if you're building a self-driving car, you would divide it into different teams. One team would just be working on one single problem of lane finding. Another team would just be working on the single problem of how to back up a car or park it. And if you want to achieve a long term vision, you have to divide it into smaller sub pieces of things that are achievable, that are bite sized, and then in those smaller near-term goals, you can get to some wins.

In a very similar way, when you try to build a complex thing, you break it down into pieces. Some [of the players] are obvious: Google, Microsoft Research, OpenAI, especially OpenAI. That is probably the bigger one betting on this particular field, making investments in this field. Obviously, universities are getting into it, but interestingly, there are other factors even from the point of view of funding. So, for example, DARPA is getting into this field by putting funding behind AI. As an example, they put something like a $2 billion investment into something called the ‘AI Next’ program. What they're really trying to achieve is to overcome the limitations of the current state of AI.

To give a few examples: Right now, if you’re creating an image recognition system, it typically takes somewhere around a million images to train on something like ImageNet, which is considered the benchmark. What DARPA is saying is, “Look, this is great, but could you do it with one tenth of the data, or could you do it with one hundredth of the data? But we’ll give you the real money if you can do it with one thousandth of the data.” They literally want to cut the scale logarithmically by half [that is, from roughly 10^6 images to roughly 10^3, halving the exponent], which is amazing.

Right. But I mean a four year old can do it with three photographs, right?

True, true, true. These are open problems that we are starting to see. One-shot learning and zero-shot learning are among those: creating a system with the minimal number of images. And you can see how far that is if you want to go towards a bigger problem like artificial general intelligence. Even the smallest, simplest thing, like showing three photographs to a toddler and having them solve it, remains a problem to be solved at this point in time.

And really what people can do that’s so interesting is that if you showed that toddler three cats, and then you went out for a walk and you saw one of those Manx cats, you know, the cat without a tail? The toddler would say, “Look, there's a cat without a tail,” even though they weren't even trained on the notion that there was such a thing. But evidently the Manx has enough ‘catness’ about it, whatever that abstraction is, that the child still correctly identifies it as a cat without a tail. Do we have to understand how humans do that before we can teach machines to do it, or will machines learn to do it a whole different way?

This reminds me of the big news story from 2017 of the Google translator inventing its own language; that was the press statement. So basically the problem was: we need to translate language A to language B. You translate English to Spanish, so you need a data set of parallel corpora, and then you translate Spanish to French with another data set of two parallel corpora. But now, with the effective representation of each and every language in one common way, you could actually just show it the parallel corpus of English and Japanese and maybe another corpus of Japanese and French, and it would start to automatically translate English to French, even though we never created [a data set] for that.

But the press translated that to be like, “It started learning its own language,” and so there are explanations that can be given on how things are learned. But the interesting thing is how it can start to assimilate, like the example you gave of AlphaGo, which started to learn on its own based on simulations. So one thing I will mention is that if you look at the trends that are happening and where we are going in 2019, the big area of simulation to overcome data scarcity is one of the big ways we will be able to train and achieve some of this learning, and it has a significant impact on robotics.

As an example, with self-driving cars, you cannot basically go and have every kind of accident and then learn how to cope with it. Another example is AI systems that are built for earthquakes. You cannot go to an earthquake zone and just keep waiting for one, and even then you will have only a few samples. So simulating those is a great way of learning. Simulation will basically be best for achieving a lot of this.

So your particular interest is in using AI for social good, especially for accessibility. So, let's start at the beginning of that story. How… of all the things in the world to do, why did you zero in on that? When did that happen?

Being in the field of machine learning for about ten years now, hackers tend to like to find interesting problems, and being at Microsoft Research, there are plenty of people to learn from and work with to build something new. When I came to North America in 2015, Skype video calls would keep me and my family connected internationally. Happy smiling faces appearing weekly over a screen. But especially with my grandfather, as he started to age it became pretty evident that his senses of hearing and vision were starting to decline. So, simple conversations had to be repeated louder. And my grandfather, who was a lifelong educator, a professor, author and an avid reader, was having a hard time reading books.

Finally one day he didn't recognize my face anymore on the Skype call. That was heartbreaking. So I started to look for solutions that would help him, and to brush up on the state of technology in the assistive space. We live in a day and age of self-driving cars, and on the other hand, assistive technologies to me were feeling decades old.

When you can’t find a solution, you do the next best thing. You try to build it yourself. So I got started, met a group of like minded folks at Microsoft and we started to explore this idea of artificial intelligence for Microsoft Research. So one of the great things in Microsoft is this whole notion of hackathons. So Microsoft holds the planet's largest hackathon every year. Last year it had 22,000 people participating in 50 plus countries around the world. So these are employees where the CEO literally tells them go take a week off. Knock yourself [out] on the idea that you really want to achieve and try to achieve it.

A bunch of friends and I got started on this and we built a prototype of a talking camera, and this prototype was literally a cell phone duct taped to the head, MacGyver style. You could speak to it, you could ask it questions and it would give you answers. That started to show a lot of promise, and from that we started to imagine how we could take this forward, so we partnered with a company called Pivothead, which makes smart glasses. You could now basically press a button on the glasses. It would take a photograph and a few seconds later it would give you an answer back describing the world.

From that, it got a lot of attention and it went viral on the internet. And so we started asking ourselves, “How do we get it in the hands of as many people as possible?” And that's how we launched Seeing AI, the talking camera app for the blind community. So Seeing AI is this cell phone app that is like a Swiss Army knife. It is small, nimble and has dozens of uses, all the way from reading books, recognizing faces, describing faces, and describing scenes to telling you the color of your clothes and, when you are at a cashier, telling you what cash the cashier just handed over to you.

So the ultimate aim is that this is like a toolbox for a person who is blind to go and achieve things independently on their own. But what's really interesting in that is in this journey you know, you go and make technology, you get it in the hands of people, but then people come up and make stories out of it in their own life. So hopefully, we can later discuss some of these amazing stories and then how that led to me working on a new technology called Aira.

No, no, keep going.

So let me tell you a couple of stories. Even in your two books you're talking about the human side of technology and how it works with it. Here are a couple of examples. We built a feature to explain where a face is on the screen in front of the cell phone. How far to the left or to the right is the person located? Are they one foot or 10 feet from you? And when you take the photo, it will tell you some attributes like gender, emotion and age, which it is trying to predict.

So a smart salesman started to use this to change his sales pitch based on the customers he's meeting. I learned myself that when my wife used to get angry, I used to take a quick photograph of her to tell how bad things were. But I learned never to do that again. So that's a quick example of how people find it useful. We built a system to learn faces on the device for privacy reasons. So [if] you're a blind person, you could take just three photographs of someone and it will train a face recognizer live on the device.

So a professor took her phone, photographed her entire classroom, then put the phone on her desk looking towards her door. And now students cannot sneak in late to the class anymore because it announces they’re late. We also noticed that people are creative, so they found a new use: currency recognition. They found that there's a guy in the middle of a five dollar bill called Lincoln. And so it's a president's face. And if you use face recognition you can literally say that that currency bill is a five dollar bill. So based on that hint, he'd be basically building a currency recognizer.

Another really good example is the facial recognition itself. When we launched the app we noticed something amazing. Blind users started posting photographs on Facebook that they took themselves. It’s like that hashtag #blindphotography just got real. And this is because, you know, when they go and meet a friend at a social occasion, they now know how to frame the face in the center, not too far away. And when they take the photograph and realize the friend is not smiling, they scold them and retake the photograph. And who doesn’t want smiling photographs on Facebook?

Another really great example was, we asked people what's the thing that you need and they said, “Well, to be able to read text.” So we built a realtime text reading system in the app. And realtime is beautiful because if you don't know where the text is to begin with, you can just scan… say a hotel and learn your hotel room number, your thermostat number, the temperature on it or where the exit sign is. And then you start learning stories of how people are using it. People said that they never knew how much text is around them.

So two really great examples: people started to sit in the back of cabs pointing the cell phone outside, and they started to learn about new stores that had opened in their neighborhood. Another good example was one user who put the phone on a tripod, pointed it towards his television, and started watching Korean movies because the app was reading the subtitles in English. Just to highlight even the smallest particular use of AI: users asked us, “Hey, could you help us recognize products?” And we said, “Sure. What's the big deal?” So we put in a barcode reader, like a barcode reading library. There are thousands of apps like that. We just slapped it inside the app and we said, “Job done. You know, we're happy.”

And then, we gave it to users and it turned out it's rubbish. The reason is, if you cannot see, how are you supposed to know… where the barcode is to begin with? That’s setting yourself up for failure. So what we learned from this experience is you have to keep the user ahead of the technology. So to solve it we trained an AI model that can recognize barcodes from far away. What it could do is start beeping. So when it sees something that is kind of like a barcode, it could start to beep, and the closer you get the more it beeps.

So the user would take a packet of chips or a Coke and they would start moving it, and when they start getting beeps they know, “Ah, maybe that's where the barcode is,” so they start bringing it closer. And once it's close enough, the barcode reading library can then decode the barcode, and then the deal is done.
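The beeping feedback described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Seeing AI's implementation: it assumes a detector that reports what share of the camera frame a barcode-like region covers, uses that apparent size as a stand-in for closeness, and invents the names `beep_interval` and `next_action` and all threshold values.

```python
def beep_interval(area_fraction, fastest=0.1, slowest=1.0):
    """Seconds between beeps; None means stay silent.

    area_fraction is the (assumed) share of the camera frame covered by the
    detected barcode-like region: 0.0 = nothing detected, 1.0 = fills the frame.
    """
    if area_fraction <= 0.0:
        return None
    fraction = min(area_fraction, 1.0)
    # A larger apparent size means the product is closer, so beep faster.
    return fastest + (1.0 - fraction) * (slowest - fastest)


def next_action(area_fraction, decode_threshold=0.25):
    """Decide whether to stay silent, keep beeping, or hand off to the decoder."""
    if area_fraction <= 0.0:
        return "silent"
    if area_fraction >= decode_threshold:
        # Close enough for a conventional barcode reading library to take over.
        return "decode"
    return "beep"


# Simulate a product being moved steadily closer to the camera.
for fraction in (0.0, 0.02, 0.10, 0.30):
    print(fraction, next_action(fraction), beep_interval(fraction))
```

The design choice worth noting is the hand-off: the learned detector only has to find something barcode-like from far away, and the exact decoding is left to the ordinary library once the user has been guided close enough.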

Now, before, blind users usually couldn’t do [that] on a cell phone, so they would buy a $1300 barcode reader, the laser kind you see in Walmart. But now you can do that for free, and that’s the part that AI can do. So this experience showed that if you're solving problems completely through AI, it can have a monumental effect on the lives of people, who are achieving millions of tasks and becoming independent.

And that journey led me to the new company Aira, which now does a more interesting part, which is ‘human in the loop.’ So it has humans who are interpreting the visuals of what a blind person is looking at from smart glasses, and now, because humans are reacting to those visuals, we can see the reaction of the human agent and learn from it, the human-in-the-loop approach, so that the AI will start getting smarter over months and years and eventually start to take some of those low hanging fruits. So that has been an interesting journey over the last five years, from 2014, from complete AI to human in the loop, and hopefully, as the future comes in, I think we're just starting to pick the low hanging fruits right now. So it's really exciting to be in this field.

So your company is a for profit company, and are you shipping a product or what are you building exactly?


We are working… to basically build a platform which a user can use through an app, so you can download the free Aira app or you can wear the Aira smart glasses. The beauty of smart glasses is that they enhance the experience. So when you work with someone who is like 80 or 90 years old, who has never used a cell phone before in life, who doesn't know how to use it and is not technical, all they do is press a button, and it instantly streams the view from the cell phone or from the smart glasses to a remote agent, who is now interpreting and describing your world in your ear through your speaker. So it's kind of like the OnStar for the blind community. That's a good way to explain it.

Where are you in your product lifecycle?

The company started… the good thing was that you need a mixture of innovation to happen in multiple places. So luckily Google Glass came out in 2014/15, which basically sparked this whole idea. And so, we started with debuting and showing the concept to people, and in 2017 and 2018 basically did the whole revamp to be able to give people these new smart glasses, which are refined. So just to give an example, when we used Google Glass the battery would die in 30 minutes. By creatively working on increasing that, now you can actually stream seven hours of live video, which is like the first in industry.

Another example is when you go to a stadium, your cell phone data usually goes out because of network conditions, but by implementing something like dynamic network prioritization you can use it in a stadium. So we have a user who went to the Super Bowl independently and had a great experience, because you have innovations in connectivity on top of the smart glasses happening there.

Then on the agent and AI side, we are basically playing active learning based games where agents are shown images and they have to come up with and guess words related to the given image. And as they type their guesses, they're actually playing a game against an AI agent who's learning from the agents and starting to guess and get better and better at it. Similarly, when you're walking in a place like an airport, you would have an agent navigate someone.

Maybe we can automate the problem by building a 3D virtual model of the world. And try to know where the objects are in this world and navigate the person autonomously. So those are the kind of problems that they’re working on. But the transformative effect we have seen of having this technology is the amazing part. A quick two or three examples: people experiencing their daughter's wedding sitting in the front seat and knowing about how the daughter walked down the aisle; someone going to a funeral and finding the tombstone on their own; someone going to the Super Bowl. We actually have a user who ran the Boston Marathon.

Some fun examples are like trying to find things that have fallen on the floor. We had an example of a human agent and a user's dog trying to find… having a game of who can get to the chocolate first, because you don't want the dog to eat the chocolate that has fallen, right? So these are fun, interesting stories which have a deep emotional connection, and that AI can be used here to help is the amazing part.

So from the AI aspect of this, how are you doing training? If it misidentifies something or it doesn't know what something is, does the image get sent back and classified, or how is this system learning?

One of the classical approaches, like ImageNet for instance, would basically collect a huge dataset based on certain words, crawled from the internet, and then it would get labelled. What we're trying to do is look at what we are really unsure about, and basically look at those items, so this is the work of active learning. You are actively learning from those items that you don't think you have enough confidence on, and you then present them to the agent. Obviously, you have to be very sure that there's nothing private in that particular image. And then the agent, we play a game with them.

So one good example is, you show an image and the agent then tries to guess a word. We might have another agent, or two or three agents, trying to guess different words which might describe the given image. And the AI is also trying to guess some of the words, and then we try to see how many of them are matching, and which of those things didn't match. And so when we start going to tens, thousands, hundreds of thousands, millions of examples, it starts to learn what it's missing.
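The loop described here, picking the images the model is least sure about and then comparing the model's guessed words with the agents' words, can be sketched as follows. This is a minimal illustration and not Aira's pipeline: it assumes the model outputs class probabilities per image, uses prediction entropy as the uncertainty measure (one common choice among several), and the function names and sample data are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy of a probability list; higher means less confident."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def select_for_labeling(predictions, budget):
    """Return the `budget` image ids whose predictions are most uncertain.

    predictions maps an image id to the model's class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda image_id: entropy(predictions[image_id]),
        reverse=True,
    )
    return ranked[:budget]


def missed_words(agent_words, model_words):
    """Words the human agents offered that the model failed to guess."""
    return sorted(set(agent_words) - set(model_words))


# Hypothetical model outputs for three images.
predictions = {
    "img_a": [0.98, 0.01, 0.01],  # confident
    "img_b": [0.40, 0.35, 0.25],  # very unsure
    "img_c": [0.70, 0.20, 0.10],  # somewhat unsure
}
to_review = select_for_labeling(predictions, budget=2)
gaps = missed_words(["suitcase", "yellow", "conveyor"], ["suitcase", "bag"])
print(to_review, gaps)
```

Only the selected images would be shown to agents, and the words returned by `missed_words` are the ones that would feed the next round of training.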

But here's the real key that you start to learn as a practitioner, which maybe you don't get to learn if you're just a researcher who is working on a static data set. When ImageNet was built, it was built by collecting examples, but the data set had a lot of bias, and the bias was a human bias. So when you look at ImageNet and you look at the data set, it has photographs of shoes that are close ups. The data set has a lot of objects which are close ups.

Imagine you are trying to look for a caterpillar. In the real world, a caterpillar is really, really, really small, but if you look at ImageNet, the area taken by the caterpillar is really big. So you notice this bias, and for those reasons many of the things learned from ImageNet don't translate to the real world. So what we are really working on is the real world.

When we tried to apply… many of the big cloud providers' state-of-the-art AI models, they started to fail because they were trained on data sets like ImageNet or COCO, and we are trying to work in a world that is badly lit. I mean, the world of a blind person who does not even know where they are looking is a messy, messy world. So, for those examples we are hoping that by training with the things that you’re unsure about, we can start to get better and better over time and hopefully fill the needs of our users well.

Other than that, and I get the caterpillar and all that, what are some of the other challenges you're facing, and what data or techniques will you use to overcome them?

Some quick examples are… a good way of making a really good product, again, is to pick one problem, solve it well, and then try to see how you can generalize it. So one example was, our users end up at an airport. They go to the baggage counter and they say, “Hey, if you see my baggage let me know.” We have agents who can see those bags, but how do you describe the bags? Do you just say it's a yellow bag, right? Like, maybe we could do it better. What we started to do is, before the user would actually send their bag off, the blind user would take four or five photographs of the bag from different angles, or maybe like a 360 degree video of the bag, and then we would put them in a drive online for our agent, and so the agent would then try to match that particular video of the bag with what’s in there.

So one example of something we are working towards is learning a model based on the limited amount of data we have of how the bag looks. So when the bag is now coming along the conveyor belt, we start to alert our agent that, “Hey, there is something that looks like the bag that the user had previously recorded.” Obviously, this bag is not going to look exactly like the video that was originally recorded, but we are trying to basically learn, again from very few images of a particular bag, what that bag would look like. So that's one example.

Another example is, we're trying to go towards learning in the 3D world. Most data sets are about the 2D world, learning from images, but we are trying to go towards learning from point clouds. You could imagine that the real world is really 3D and the camera takes a 2D representation of it. But if you could go back from 2D to the 3D world, could you learn what objects are located in this 3D world and where the 3D surfaces of those particular objects are? If you're working in the world of augmented reality, as an example, and you knew the surfaces of the 3D world, you could then know the correct location of something. Maybe if you knew where the sofa was, and the surface of the sofa in the 3D world, you could make the blind user walk over and sit on it nicely.

So we're trying to learn the 3D coordinates of this world in a large location like an airport. I think that is our north star that we can take the airport which is like a complex scenario and make that happen in the next two years. And hopefully bring those technologies that we have in the self driving world in the palm of a blind person to achieve more. I think that's like one of the big goals that we have and this can be achieved with both 3D understanding of the world, navigation in the real world and understanding of visual perception we already have.

What is the current price point right now?

This is actually a pretty interesting point. We had a subscription based service, just like you have a cell phone and you pay Verizon or AT&T money per month. We start out at like a $29 plan and you buy some minutes, right? That's the classical subscription based model. But we know that 70% of our user base, I mean, 70% of the people in the blind community are unemployed. 60% of the students in this community are not going to graduate. These are the shocking statistics. So what we have opened is a flipped way of basically giving our service away for free. And the way is that we partner with institutions, we partner with locations.

So right now, about 25,000 locations around the United States are free Aira locations, which means that the moment you walk inside, Aira is available for free. As an example, when you walk into a University of California property like University of California, San Diego, about 20,000 acres of it is a free Aira zone. So now, if you are a student you have the option of studying and having an agent available any time, any day for free there. If you walk inside a Wegmans or Walgreens or Target, and you want to do shopping independently, you don't need to have a companion or ask for assistance because you can have Aira for free. When you walk into Heathrow International Airport or Las Vegas, again Aira is free.

So to a person who is blind, the opportunities just become open. Another really good example is we did a partnership with Intuit, who sponsored small businesses. Because of the unemployment rate, many people in this community want to open businesses on their own. And so we opened it, and we noticed that in the first two, three weeks we had close to 15 small businesses opened and registered by blind entrepreneurs. They would do things like go and take photographs of the merchandise they want to sell and put it on Amazon or Etsy, or maybe go to the tax website and self certify them.

Another great example was we opened the program for people for careers. Just like you have a gym membership, maybe your employers might give you Aira as an open benefit by looking. But if you're trying to go for an interview, we give you access for free. So we give a hundred subscriptions to students for free. And the dropout rate, which was 60%—is dramatically reduced. So we have students that are still continuing to study in this particular program and we can see that step by step we can hopefully reduce the dropout rate in education by one tenth and the unemployment rate in this community by one tenth and that's our real goal.

How many people are using the technology right now?

We have a few thousand subscribers right now. We have been in this space for about one and a half years publicly, I mean, selling it as an active subscription, but I think this is just increasing at an exponential rate. Every few months we are almost doubling the number of users. So really what we are seeing is that by having this access available everywhere, step by step, by partnering with leaders like Target, like Walgreens, like AT&T, I think this is opening up opportunities for blind people. And the more they get to know about it, they just fall in love with it.

Are there other applications of the technology you're building that benefit other groups potentially or is it highly specific technology?

I think our aim is focused towards the users who are blind, but what might be starting to happen is, we started hearing about people who are not blind starting to use our service organically. We started looking into why, and it turns out that these were people who are 80 years plus, who are using the service just because they find it really helpful to have someone who they can call to solve day to day needs. And this might not just be vision related, you just need people to help you in… I'm sure you must have been getting calls from your parents or from someone who might not know how to technologically solve something. Examples like that, we started getting a lot.

You obviously find a lot of people interested in use cases either for services like field workers who are in industry, in factories, who you know ask us, “Hey, could we use Aira for training our people or maybe connecting them to experts?” So as a quick example because of the training we are trying to do, we’re trying to solve problems like if you call from the Denver Broncos stadium, we can connect you to someone who is better at sports. Whereas, if you're trying to call about makeup, asking for makeup advice we can autonomously connect you with an expert who is… empirically we have seen females do better at that. So connect with someone who is an expert on this.

So we noticed that when you start with disability as your first goal, you end up opening technology for many, many broader use cases. And just to add to this, historically it has happened again and again that accessibility accelerates innovation. Back in the 1800s, a gentleman was trying to build a way for [his] blind lover to write letters legibly and he ended up inventing the typewriter. Similarly, Alexander Graham Bell, I think you probably know the story, was trying to build devices for his wife and his mother, who were deaf. He ended up inventing the telephone.

Back in the 1970s Ray Kurzweil, the famous inventor, was sitting on an airplane next to a blind individual, and he asked the blind person, like, “What's the one thing in life that you really wanted?” And the blind individual said, “I want to read. I want to read books.” So Ray Kurzweil went back into his garage and worked on a machine, this washing-machine-sized thing. It was a reading machine, and in building it he ended up inventing the flatbed scanner, text to speech, OCR. When you have Siri talking to you, you can basically thank a blind person back in the 1970s. Accessibility has helped accelerate mainstream technology again and again. And we are also finding that by working on blindness, which is the hardest thing to solve for, the uses of this technology can be so much more in the future.

How many blind people are there in North America and in the whole world?

In the world, believe it or not, there are close to 294 million blind people, the blind and low vision group. This is almost the population of the United States. This is not a niche. In the United States I think it's 22 million plus. So it's a large population. It's not something to ignore, I’d say.

What would be the strategy for getting the technology into the hands of people in less prosperous areas? Is it still via the cell phone?

That's a good question. So the cell phone, as in a cell phone app, is probably a lot more ubiquitous, because people around the world, even in developing countries, have cell phones, an Android or an iPhone. So getting them the technology with the cell phone is easy, but for someone who wants a better user experience, wearable devices are probably the better way to go.

We also noticed that platform devices like Google Home or Alexa are actually getting bigger and bigger. We can try to have conversational interfaces to give users some information based on their surroundings, with some camera in the vicinity. So that's a good place.

I should actually mention something I unintentionally forgot: we also allow, with the help of TeamViewer, agents to log on to a blind person's computer. So imagine a blind person who is trying to shop for clothes and wants an opinion, the agent can actually help them. We had a user who was trying to release music, and it happened that the agent we connected them to was good at graphics and Photoshop. So the agent was able to help them on the computer and design the graphics. So it's basically phone, glasses, or your computer, and hopefully maybe in the future devices like Alexa and so on.

All right. Well it sounds like you're doing an amazing job at solving a very hard problem and I salute you for it. Do you feel like the technology is finally… like you couldn't have done this 15 years ago, right? Do you feel like we have the technology now to do it and we just need the will and it's all gonna happen now, or is there additional technology that needs to be developed?

I think six years ago we probably would not have been talking about this whole area. As researchers, we always see the promising trends, follow them, and hopefully a few years down the line they become reality beyond just hype. In the last six years, the computation power available for doing the same experiments has gone up almost from 300 to 300,000 times. The time to train some of these networks has reduced from months to, I mean, yesterday it was like four minutes, as more of the techniques have come along to improve the learning.

We see the direction it has gone in. I should actually also give a shout out to organizations that are learning to go in the direction of inclusiveness, improve diversity, understand the risks, and also offer money to researchers. For example, Google giving a $25 million grant to AI for Good, Microsoft giving close to a $115 million grant for AI for Good, XPRIZE giving $5 million and a total of like $150 million for Societal Impacts, and the Partnership on AI giving $4.5 million. So if the researchers out there are listening, if there's one thing you should hear, it's that you have possible funding to use your talents to work in the area of social good and make an impact that could be felt for years.

All right. Well, actually I think that's a great place to leave the conversation. It's fascinating what you're doing and I'm sure it's very, very rewarding. I want to thank you for being on the show.

Thank you very much. It's a pleasure being here.
