Forward Thinking on artificial intelligence with Microsoft CTO Kevin Scott



How could AI help create jobs even in rural areas, and what would
it take? In this podcast episode, Kevin Scott shares his ideas with
James Manyika.


June 2021
McKinsey Global Institute

In this episode of the McKinsey Global Institute’s Forward Thinking podcast, MGI’s James Manyika explores the implications of artificial general intelligence (AGI) for jobs, particularly in rural America, with Kevin Scott, Microsoft’s chief technology officer and author of Reprogramming the American Dream: From Rural America to Silicon Valley—Making AI Serve Us All (HarperCollins, 2020).

An edited transcript of this episode follows. Subscribe to the series on Apple Podcasts, Google Podcasts, Spotify, Stitcher, or wherever you get your podcasts.

Michael Chui: We’ve been hearing for a long time that robots are coming for our jobs. Now, with widespread global unemployment due to COVID-19, that sounds even more ominous. But what if robots and AI could, in fact, help with recovery?

Well, it’s possible. For instance, in some rural parts of the US, artificial intelligence and machine learning are making these regions more economically viable. One of the big topics we analyze at the McKinsey Global Institute (MGI) is artificial intelligence, how it’s impacting work, and what that means for society.

In this episode of Forward Thinking, we’ll hear an interview with one of the leading technology strategists in the world: Kevin Scott. Kevin is

In his book, Kevin Scott explains how AI can help create new opportunities in rural America. (Photo: HarperCollins)


the chief technology officer and vice president of artificial intelligence and research at Microsoft. He also has a new book out called Reprogramming the American Dream.

The interview is conducted by MGI’s own James Manyika, who is a co-chairman and director of the McKinsey Global Institute, and a senior partner at McKinsey & Company. He’s also a deep expert in his own right when it comes to artificial intelligence and machine learning. That’s why James sat down with Kevin to discuss how AI might be the key to democratizing technology to work better for all of us. It’s a fascinating conversation. See what you think.

Kevin Scott: Thank you so much for having me, James.

James Manyika: Delighted to have you. I’ve been looking forward to this conversation for some time. We’re going to spend a fair amount of time discussing your book, but first I wanted to talk about what you’re working on right now. You’re building some of the largest, most complicated computer systems in the world. And much of that is being applied to AI systems. What are you most excited about?

Kevin Scott: There are many things that we’ve been working on for the past couple of years that I’m excited about, including these large-scale computing platforms for training a new type of deep neural network model. Some people called them unsupervised—we’ve taken to calling them self-supervised learning systems.

It’s been really thrilling to build all of the systems infrastructure to support these training computations that are absolutely enormous, and to see the progress made on these large self-supervised models and with deep reinforcement learning. I never thought we would get to some of the milestones that we’ve been able to hit over the past couple of years.

James Manyika: Well, Kevin, as always, you’ve already said a lot that’s interesting in your opening remarks. Say more about supervised learning for a moment. This is the idea that the AI techniques that we use today mostly learn from examples that we give them. You’re talking about going beyond that to self-supervised or unsupervised systems. Why is that such a big shift?

Kevin Scott: Around 2012 or so, the big revolution began happening with deep neural networks in machine learning, and these models doing supervised learning have been able to accomplish a lot in speech recognition and computer vision and a whole bunch of these perceptual domains. We very quickly went from a plateau that we had hit with the prior set of techniques to new levels that in many cases approximate or exceed human performance at the equivalent task.

The challenge with supervised learning is that you need a lot of data and a lot of computational power. And the data that you train on is labeled. With self-supervised models, you also are training on huge amounts of data. And you need enormous amounts of compute. But you require very little or, in some cases, no supervision. No labels, no examples, so to speak, to tell the model what it is that you want it to do. The models are training over these enormous amounts of data to learn general representations of a particular domain. Then you use them to solve a whole variety of problems in that domain. This has absolutely transformed natural language processing over the past couple of years.

James Manyika: I remember last year when you published your results with Turing NLG (natural language generation). That was quite impressive.

Kevin Scott: The really interesting thing is that you don’t have the constraint of having to supply these models with large numbers of labeled training data points or examples. You really are in this mode where the models are scaling up, mostly as a function of the compute power you can apply to them.

James Manyika: When we start training systems with these very large-scale models, does that help us with another problem—the possibility of transfer learning? Because that hopefully will obviate the need to train systems every single time.

Kevin Scott: The exciting thing that we’re seeing with these big self-supervised models is that transfer learning does work. That means that you can train a model on a general set of data and then deploy it to solve a whole variety of different tasks.

James Manyika: That also takes us down the path of potentially democratizing technology access.
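[Editor’s note: the pretrain-then-transfer pattern discussed above can be illustrated with a deliberately tiny sketch. Everything in it is invented for illustration (a toy character-level corpus and a made-up plausibility score), not a description of Turing NLG or any Microsoft system; the point is only that the training signal comes from the raw data itself, and the learned representation is then reused for a task it was never explicitly trained on.]

```python
from collections import Counter

def pretrain(corpus):
    # "Self-supervised" step: the prediction targets come from the raw
    # text itself (each character predicts the next one), so no human
    # labeling is needed.
    pairs = Counter(zip(corpus, corpus[1:]))
    totals = Counter(corpus[:-1])
    # Learned representation: conditional probability P(next char | char).
    return {(a, b): n / totals[a] for (a, b), n in pairs.items()}

def plausibility(model, text):
    # "Transfer" step: reuse the pretrained representation to score how
    # plausible a new string is, a task it was never directly trained on.
    score = 1.0
    for pair in zip(text, text[1:]):
        score *= model.get(pair, 1e-6)  # tiny floor for unseen pairs
    return score

model = pretrain("the cat sat on the mat and the hat ")
# Text resembling the pretraining distribution scores higher than
# scrambled text, without any labeled examples of "plausible" strings.
assert plausibility(model, "the cat") > plausibility(model, "tac eht")
```

Real self-supervised models replace these bigram counts with billions of neural-network parameters, but the economics Kevin describes are the same: the expensive pretraining step needs no labels, and many downstream tasks can reuse its output.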


Kevin Scott: That’s one of the things that I wrote about in my book, and it’s perhaps the primary thing that gets me out of bed every morning and makes me excited to go to work. I really do believe that when we’re thinking about technology, we should always be thinking about what platforms we can create that empower other people to solve the problems that they are seeing, and to help them achieve what they want to achieve.

It can’t only be a small handful of large companies, or companies that are only located in urban innovation centers, that are able to make full use of the tech that we’re developing to solve problems. It really does have to become democratized. What I’ve been telling folks as I’ve talked about the book is that in 2003 or 2004, when I wrote my first real machine learning system, you really did need a graduate degree in an analytical discipline. You would sit down with these daunting graduate-level textbooks and stacks of papers. You had to be relatively mathematically sophisticated in order to understand how the systems worked. Then you would spend a whole bunch of time and energy writing lots of low-level code to do something. I went through this process for that first system I wrote, and it took about six months of coding to solve the problem that I was working on.

Fast-forward 14 years to 2020. Because of open-source software, because we’ve thought about how to solve these problems in a platform-oriented way, because we have cloud computing infrastructure that makes training power accessible to everyone, and because you have things like YouTube and online training materials that help you more quickly understand how all of these framework pieces work, my guess is that a motivated high school student could solve the same problem in a weekend, whereas that took me six months over 14 years ago. That really is a tangible testament to how far these tools have already become democratized. All the indicators point to the fact that they’re going to become further democratized over the next handful of years.

James Manyika: Kevin, as you know, I’m involved with the Broad Institute, which is one of the leading genomics research institutes in the world. And today, fully a third of the people there are AI computational people. That’s become necessary and integral to the research enterprise that places like the Broad are doing. But before we leave the question of AI and computing, I have to ask this question. Where do you think we are on this path towards AGI? And here I’m not really asking you to make a prediction, because I know that’s difficult. But I’m curious to hear where you think we are on that journey and what you believe are some of the big problems we have to solve before we can even get to AGI.

Kevin Scott: Ever since the Dartmouth workshop in 1956, where they invented the name for the field, the whole history of AI has been about attempting to create AGI. That started back in 1956, when the luminaries of the AI discipline of computer science met. At that time, they laid out this road map where they were trying to build software that could emulate, in a very general way, human intelligence.

That has proven to be a very difficult task. It’s unclear exactly how many problems of intelligence you can solve with more data and more compute, which I think is one of the reasons why it’s tricky to make accurate predictions about when you get to general intelligence.

Every time that we have used AI to solve a problem that we thought was some high watermark of human intelligence, we have changed our mind about how important that watermark was. When we were both much younger, when I was in graduate school, the problem trying to be addressed was whether we could build a computer, an AI, that could beat a grand master at chess.

Turns out the answer to that was yes. And we did it. There was a whole bunch of fanfare at that time. I think it helped us a little bit. It shone a light on how you could advance a particular part of artificial intelligence. But it certainly didn’t mean that the machines were suddenly taking over the role that human beings played.

It really hasn’t even made a material dent in chess, other than some of the techniques that we built in our AI are now used to help humans practice to become better human chess players. That’s actually the thing we care about: humans playing humans at chess.

James Manyika: Exactly. I remember when I was a grad student researching and studying AI, we


used to think about Turing tests and all kinds of tests. But we keep moving the goalposts, as you said. Every time we solve a problem, we shift what we think the watermark really is. Which leads me to the subject matter and the topics you address in your book. One of the things I loved about your book is that it also gave me a window into you, Kevin, because I’ve always thought about you as a technologist building these very large-scale systems. But you grew up in a place that most people don’t associate with technology. You grew up in a small town in Virginia. Say more about that.

Kevin Scott: I grew up in this small town in rural central Virginia called Gladys. I’m fairly certain that there are more cows in Gladys than there are residents. There’s one state route that runs through the town. There isn’t even a stoplight to slow traffic down as it flows through. It’s a farming community. It is not a place that was associated with technology.

Neither of my parents had gone to college. I was the first person in my family to get a four-year college degree. I grew up in the ’70s and ’80s. When I was born, there was no personal computer.

There’s a lot of good luck that I benefited from in my career and in my life. I was between ten and 12 years old when the personal computer really started to hit—remember the Commodore C64, the RadioShack color computers, and the TRS-80s and the Apple IIe? You had computers on the shelves in department stores. And I was just completely fascinated by these things. I wanted to understand how they worked.

James Manyika: That’s fascinating. I remember I had the Sinclair Spectrum. I don’t know if you’re familiar with it—we had it in the UK and other parts of the world. One of the claims that you make in your book that is very exciting is this idea that technology, and AI in particular, can bring prosperity to all parts of America, including rural America. Tell me more about why you think that’s possible.

Kevin Scott: When I started writing the book, I had been doing machine learning for so long in a bunch of different contexts. I had been living in Silicon Valley and working in the technology industry for such a long time that I really had this idea in my head that maybe these technologies weren’t going to benefit people in rural America.

One of the first things I did was to go back home and chat with some people I grew up with. And I had this “aha” moment, almost the second that I set foot in some of these places where my friends were working. I was just reminded that these are some of the most ingenious and industrious people that I know.

They were already running businesses. They had pivoted with all the twists and turns that the economy had thrown at them and built businesses that were already using the most advanced technology that they could lay their hands on.

What’s more, I believe that the machine learning systems are going to get exposed to entrepreneurs who are in these communities in more concrete ways, so that they can build even more ambitious things.

If they didn’t have this technology, these businesses wouldn’t exist. The reason that they are competitive in this fierce global market for manufacturing is because the automation that they can leverage is just as efficient no matter where it’s running geographically.

They created a whole bunch of high-skilled jobs in this tiny little community in central Virginia that wouldn’t exist otherwise. With more high-skilled jobs, they create this beneficial effect inside of the community. Enrico Moretti wrote this wonderful book called The New Geography of Jobs. Enrico is an economist at the University of California, Berkeley. In his research, he posited that a single high-skilled job can create five lower-skilled jobs inside of the community where the high-skilled job is created. And you can see that economic effect in some of these places.

James Manyika: I guess the question in my mind is, why don’t we see this happen in more places?

Kevin Scott: Well, the thing that I saw—and this is, granted, just anecdotes—is that it’s happening in more places than I thought. As soon as I saw this pattern, I thought, wow, where else might this be happening?

You can see it at scale in Germany with the Mittelstand, which typifies this model of combining the high-skill, highly trained labor and augmenting them with really sophisticated


technology, whether it’s a manufacturing business or a services business, or whatever.

You have lots of these businesses in Germany creating lots of economic output. In some ways, I think the Mittelstand is this pillar of the German economy. And when I started looking here in the United States, I saw more of these sorts of businesses than I expected.

One of the challenges for getting these instances running in communities is partially about capital allocation. Do the entrepreneurs in these communities have reasonable access to venture capital, so that they can try out their most ambitious ideas?

Then you have this basic stuff that’s just shameful that it isn’t already solved, like access to broadband, or the vocational education required to ensure people can use these tools effectively to do the work of the future.

James Manyika: Think about all the examples you’ve got in your book and coupling those together with the moment that we’re in. We’re now in this extraordinary moment where it feels like the economy fell out from underneath us.

At the same time, there’s all this transformation that’s required. What do you think we could do, and America could do, in this moment in order to capitalize on the ideas you’ve got in the book, and also use this galvanizing moment to reimagine what the future might be? Any thoughts?

Kevin Scott: One of the things that I wrote about in the book is this idea that government investment can be a good catalyst for innovation. Think about the self-driving industry, for instance, and all these autonomous vehicles. I would argue that there is a primary reason that these ingenious people decided when they were graduate students at Stanford and Carnegie Mellon to focus on solving that problem of how you get a vehicle to be able to drive itself. At the time there were these DARPA Grand Challenge problems. You had funding that was going to graduate schools and a prize at these milestones toward solving this problem.

I think that could be applied in tangible ways to helping solve some of the big challenges that we, as a society, face. You could even go much more ambitious than something like a DARPA Grand Challenge. And I don’t think it’s an either/or. Maybe you want to do a bunch of these things.

We ran the Apollo program in the ’60s not because there was anything especially necessary about putting a human being on the moon, but because solving that problem was a great way to focus human ingenuity at a massive scale on a set of technologies that turned out to be very beneficial. For example, our modern aerospace industry came out of the Apollo program.

I think we could pick a challenge like healthcare. We could say, enough is enough, it’s time that every human being on the planet has access to high-quality, low-cost healthcare. And here’s this list of diseases and conditions that we want to radically transform. Let’s cure, eliminate, minimize the impact and suffering from these things and spend at the same level of the Apollo program. Maybe you don’t even need to spend that much.

It’s not a huge amount. It’s about 2 percent of GDP for a handful of years. I think you really could transform not just human well-being through the end product of what you’re building. The process of solving the problem could put into place this infrastructure that could also define entire new sectors of the industry and our economic outputs for decades ahead.

James Manyika: I couldn’t agree more. It’s quite sobering to remember, to your point, Kevin, that the peak of investment in overall federal R&D funding as a percentage of GDP was in 1964. That’s when it was close to 2 percent. It stayed there for a while. Today it has dropped, at least in the US, to about 0.6 percent. I think there’s a role, as you said, for what reallocation in investment could do to drive the change that we’re talking about.

But let me ask this. There’s clearly so much that society and the economy could benefit from democratizing technology and innovation and doing it at a very large scale. But there’s also always the concern about potential misuse, misapplication, or risks associated with technology. How do you think about that question? How can we be more thoughtful and ensure we don’t misapply these technologies?


Kevin Scott: One of the ways that I think about it is that, as we invented software engineering as a discipline over the course of the past 60 years or so, we realized that finding all the bugs in software is hard. We built a whole bunch of practices to try to catch the most common types of software bugs. We created a set of techniques to help us mitigate the impact that the bugs that slip through will have.

We’re going to have to build a similar set of things for machine learning models and AI. To give you a few examples of what we’re thinking about, we at Microsoft have a sensitive uses practice for AI now. For anybody building a machine learning model that’s going to be used in a product, the company has a set of guidelines that define what is or what isn’t a potential sensitive use of that technology.

If it is a sensitive use, then it has to go through a very rigorous review process to make sure that we are using the technology in ways that are making fair and unbiased decisions; that the data that we’re using to train the model and the data that we’re collecting as a byproduct of the use of the model are treated in a proper way, preserving all of the covenants that we have with all of our stakeholders; and that we have the degree of transparency and control that you need in the most sensitive situations.

Bias in data is something we’ve talked a lot about as a community over the past handful of years, and we now have tools that can detect when a data set has fundamental biases in it. We’re using GANs (generative adversarial networks), which are another type of neural network, to generate synthetic data to compensate for those biases, so that we can train on unbiased data sets even though we may not have representative data that is naturally occurring in the data set to help us train the model in the way that we want.

It’s a whole spectrum of things, and I think the dialogue that we’ve got right now between stakeholders, people building these models, and people who are analyzing them and sort of pushing us and advocating for accountability—all of that’s good. It’s a good thing that’s happening, and I’m delighted that we have this ongoing debate.

James Manyika: One of the things I like about what you’re describing, Kevin, is that you’re emphasizing the fact that there isn’t a silver bullet solution to these issues. That it’s going to take concerted effort by the engineers, the scientists, the ethicists, a whole range of people thinking together about how to make sure these systems are safe. That we get all the benefits that we should be getting out of them.

Kevin Scott: Absolutely.

James Manyika: Kevin, before we go, I wanted to hear more about the smaller-scale AI projects you’re working on. I’ve heard something about an AI coffeemaker. Does it know how much caffeine you need, by the way?

Kevin Scott: I think that’s an incomputable thing! I need lots of caffeine. I really can’t help myself. One of the ways that my curiosity manifests as I try to understand the world is, like, I just want to build things. It’s something that I get from my grandfathers and from my dad, who spent their entire lives making things with their hands, in their work and in their hobbies.

I weirdly have this thing for coffee machines. I’ve built four of them in the past, and I’m working on one right now, which is a vacuum siphon coffee machine that has an AI user interface. Which creates this cognitive dissonance when I’m trying to slam these two things together.

Vacuum siphon coffeemaking has been around since Victorian England, potentially longer than that. But they were very popular in Victorian society. And when you look at one of these things, it is almost the definition of steampunk. I’m building this steampunk coffee machine that has a modern user interface on it. Instead of buttons and a screen, it has a speaker, a camera, and a microphone. And so, when you approach the machine, it uses facial recognition to see whether it recognizes you. If it doesn’t, it has a dialogue with you about how you want your coffee made. And then it offers to remember your preferences associated with your face. And if it recognizes your face because you’ve given it your preferences before, it will ask you, “Kevin, would you like a cup of coffee?”

The funny thing about this machine is that the hard part wasn’t the AI. All the speech recognition, face recognition, and the code that you need to store preferences and whatnot is not that complicated, given open-source tools. The electronics required


to run the stuff is a little bit pricier than you would want to use in a consumer-grade coffee device. The fascinating thing is the $30 worth of electronics it takes right now to run the AI part of the machine, plus the control loop for the rest of the device. Like, the compute there is still one of the places where Moore’s Law is working well. These low-cost microprocessors and microcontrollers are on silicon fabrication technology that’s a few generations behind.

And those cheap microcontrollers and microprocessors are going to be getting much more capable over the coming years. I think we might get to the point soon where you can do all of the stuff that I’m doing for three bucks or a dollar’s worth of electronics, which then makes it a very feasible way to build a user interface for something. It’s coming.

James Manyika: That’s very cool. Well, I can’t wait to try out the coffee. Hopefully, it’s actually good coffee too.

Kevin Scott: We will see [laughs]!

James Manyika: Well, Kevin, I want to thank you again so much for joining us. This was such a pleasure for me and for hopefully the audience that’s going to listen to this. I’m looking forward to catching up again and continuing our conversations.

Kevin Scott: Thank you so much for having me. And as always, it’s a pleasure chatting with you.

James Manyika is a co-chairman and a director of the McKinsey Global Institute. Kevin Scott is chief technology officer at
Microsoft. Michael Chui is a partner of the McKinsey Global Institute.

Designed by the McKinsey Global Institute.


Copyright © 2021 McKinsey & Company. All rights reserved.

