
Lesson 1: Practical Deep Learning for Coders 2022

Welcome to Practical Deep Learning for Coders,
lesson one. This is version five of this course, and it's
the first new one we've done in two years. So, we've got a lot of cool things to cover! It's amazing how much has changed. Here is an xkcd from the end of 2015. Who here has seen xkcd comics before? …Pretty much everybody. Not surprising. So the basic joke here is… I'll let you read it, and then I'll come back
to it. So, it can be hard to tell what's easy and
what's nearly impossible, and in 2015 or at the end of 2015 the idea of checking whether
something is a photo of a bird was considered nearly impossible. So impossible, it was the basic idea of a
joke. Because everybody knows that that's nearly
impossible. We're now going to build exactly that system
for free in about two minutes! So, let's build an “is it a bird” system. So, we're going to use python, and I'm going
to run through this really quickly.

You're not expected to run through it with
me because we're going to come back to it. But let's go ahead and run that cell. Okay, so what we're doing is we're searching
DuckDuckGo for images of bird photos, and we're just going to grab one, and so here is the URL of the bird that we
grabbed. Okay, we'll download it. Okay, so there it is. So we've grabbed a bird and so okay we've
now got something that can download pictures of birds. Now we're going to need to build a system
that can recognize things that are birds versus things that aren't birds, from photos. Now of course computers need numbers to work
with, but luckily images are made of numbers. I actually found this really nice website
called Pixspy where I can grab a bird, and if I wiggle over it (let's pick its beak)
you'll see here that that part of the beak was 251 brightness of red, 48 of green, and
21 of blue. So that's RGB. And so you can see as I wave around, those
colors are changing (those numbers).

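To see the same idea in code, here's a hedged numpy sketch with a tiny synthetic image standing in for the real photo – the 251/48/21 "beak" values come from the demo above, everything else is made up for illustration:

```python
import numpy as np

# A tiny synthetic 4x4 RGB "photo": rows x columns x 3 channels,
# each value 0-255, just like the real 256 x 171 x 3 bird photo.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[0, 0] = [251, 48, 21]      # the reddish "beak" pixel from the demo

r, g, b = img[0, 0]
print(int(r), int(g), int(b))  # 251 48 21
print(img.shape)               # (4, 4, 3)
```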
And so this picture, the thing that we recognize
as a picture, is actually 256 x 171 x 3 numbers, between 0 and 255, representing the amount
of red, green and blue on each pixel. So that's going to be an input to our program,
that's going to try and figure out whether this is a picture of a bird or not. Okay, so let's go ahead and run this cell,
which is going to go through… (and I needed bird and non-bird but you can't really search
Google images or DuckDuckGo images for not a bird, it just doesn't work that way. So I just decided to use forest – I thought
okay pictures of forest versus pictures of birds sounds like a good starting point.) So I go through each of: forest, and bird.

And I search for forest photo and bird photo,
download images, and then resize them to be no bigger than 400 pixels on a side – just
because we don't need particularly big ones and it takes a surprisingly large amount of
time just for a computer to open an image. Okay, so we've now got 200 of each. I find when I download images I often get
a few broken ones and if you try and train a model with broken images it will not work. So here's something which just verifies each
image and unlinks (so deletes) the ones that don't work. Okay, so now we can create what's called a
data block. So after I run this cell you'll see that I've
basically… I'll go through the details of this later,
but… a data block gives fast.ai (the library) all the information it needs to create a computer
vision model. And so in this case we're basically telling
it… get all the image files that we just downloaded.

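By the way, that verify-and-resize housekeeping from a moment ago can be sketched in plain PIL. fastai bundles helpers for this (`resize_images`, `verify_images`), but conceptually it's just the following – the folder path here is hypothetical:

```python
from pathlib import Path
from PIL import Image

def tidy_images(folder, max_size=400):
    """Shrink every image so its longest side is at most max_size px,
    and delete any file that won't open as an image."""
    for p in Path(folder).iterdir():
        try:
            im = Image.open(p)
            im.thumbnail((max_size, max_size))  # shrinks in place, keeps aspect ratio
            im.save(p)
        except Exception:
            p.unlink()  # broken download - delete it, or training will fail
```

Calling something like `tidy_images('bird_or_not/bird')` after each download loop mirrors what the verify-and-unlink cell does.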
And then we say show me a few up to six, and
let's see… yeah, so we've got some birds, forest, bird, bird, forest. Okay, so one of the nice things about doing
computer vision models is it's really easy to check your data because you can just look
at it – which is not the case for a lot of kinds of models. Okay, so we've now downloaded 200 pictures
of birds, 200 pictures of forests, so we'll now press run.

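What that data block boils down to is answering a few questions: where are the items, how is each one labelled, and how do we split training from validation data. A plain-Python sketch of those answers – the lesson's real call is fastai's `DataBlock`, and the filenames here are made up:

```python
import random

# Hypothetical file list standing in for the downloaded photos;
# the parent folder name carries the label.
files = [f"bird/{i}.jpg" for i in range(200)] + \
        [f"forest/{i}.jpg" for i in range(200)]

def label_of(path):
    return path.split("/")[0]          # "bird" or "forest"

random.seed(42)
random.shuffle(files)
split = int(len(files) * 0.8)          # hold out 20% for validation
train, valid = files[:split], files[split:]

print(len(train), len(valid))          # 320 80
```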
And this model is actually running on my laptop,
so this is not using a vast data center. It's running on my presentation laptop. And it's doing it at the same time as my laptop
is streaming video, which is possibly a bad idea. And so what it's going to do is it's going
to run through every photo out of those 400, and for the ones that are forest it's going
to learn a bit more about what forest looks like and for the ones that are bird it'll
learn a bit more about what bird looks like. So overall it took under 30 seconds, and believe
it or not, that's enough to finish doing the thing which was in that xkcd comic.

Let's check by passing in that bird that we
downloaded at the start. This is a bird. Probability it's a bird: 1.0000 (rounded to
the nearest four decimal places). So something pretty extraordinary has happened
since late 2015, which is literally something that has gone from so impossible it's a joke
to so easy that I can run it on my laptop computer in (I don't know how long it was)
about two minutes. And so hopefully that gives you a sense that
creating really interesting, you know real working programs with deep learning is something
that… doesn't take a lot of code, didn't take any math, didn't take more than my laptop
computer. It's pretty accessible in fact. So that's really what we're going to be learning
about over the next seven weeks.

So where have we got to now with deep learning? Well it moves so fast, but even in the last
few weeks we've taken it up another notch as a community. You might have seen that something called
DALL·E 2 has been released, which uses deep learning to generate new pictures. And I thought this was an amazing thing that this guy Nick did, where he took his friends' Twitter bios and typed them into the DALL·E 2
input and it generated these pictures.

So this guy's… he typed in commitment, sympathetic, psychedelic, philosophical, and it generated these pictures. So I'll just show you a few of these. I'll let you read them… I love that. That one's pretty amazing, I reckon, actually. I love this. Happy Sisyphus has actually got a happy rock
to move around. So this is like, um, yeah, I don't know. When I look at these I still get pretty blown
away that this is a computer algorithm using nothing but this text input to generate these
arbitrary pictures. In this case of fairly, you know, complex
and creative things. So the guy who made those points out, this
is like… he spends about two minutes or so, you know, creating each of these. Like he tries a few different prompts and
he tries a few different pictures, you know, and so he's given an example here of… like
when he types something into the system… like, here's an example of like 10 different
things he gets back when he puts in “expressive painting of a man shining rays of justice
and transparency, on a blue bird Twitter logo.” So it's not just, you know, DALL·E 2,
to be clear.

There's, you know, a lot of different systems
doing something like this now. There's something called MidJourney, which
this twitter account posted: a “female scientist with a laptop writing code, in a symbolic,
meaningful, and vibrant style.” This one here is “an HD photo of a rare
psychedelic pink elephant.” And this one I think is the second one here
(I never know how to actually pronounce this.) This one's pretty cool “a blind bat with
big sunglasses holding a walking stick in his hand.” And so when actual artists, you know, this
for example, this guy said he knows nothing about art, you know he's got no artistic talent,
it's just something you know, he threw together. This guy is an artist who actually writes
his own software based on deep learning and spends, you know, months on building stuff,
and as you can see, you can really take it to the next level.

It's been really great actually to see how
a lot of fast.ai alumni with backgrounds as artists have gone on to bring deep learning
and art together, and it's a very exciting direction. And it's not just images to be clear, you
know one of another interesting thing that's popped up in the last couple of weeks is google's
pathways language model which can take any arbitrary english as text, a question, and
can create an answer which not only answers the question but also explains its thinking
(whatever it means for a language model to be thinking.) One of the ones I found pretty amazing was
that it can explain a joke. I'll let you read this… So, this is actually a joke that probably
needs explanations for anybody who's not familiar with TPUs. So it, this model, just took the text as input
and created this text as output. And so you can see, you know, again, deep
learning models are doing things which I think very few, if any of us, would have believed
would be maybe possible to do by computers even in our lifetime. This means that there are a lot of practical and ethical considerations.

We will touch on them during this course but
can't possibly hope to do them justice. So I would certainly encourage you to check
out ethics.fast.ai to see our whole data ethics course, taught by my co-founder Dr Rachel
Thomas, which goes into these issues in a lot more detail. All right, so as well as being an AI researcher
at the University of Queensland and fast.ai, I am also a homeschooling primary school teacher
and for that reason I study education a lot.

One of the people who I love in education
is a guy named Dylan Wiliam, and he has this great approach in his classrooms of figuring
out how his students are getting along, which is to put a coloured cup on their desk – green
to mean that they're doing fine, yellow cup to mean I'm not quite sure, and a red cup
to mean I have no idea what's going on. Now since most of you are watching this remotely
I can't look at your cups, and I don't think anybody brought coloured cups with them today,
so instead we have an online version of this. So what I want you to do is go to cups.fast.ai/fast
– that's cups.fast.ai/fast – and don't do this if you're, like, a fast.ai expert who's
done the course five times – because if you're following along that doesn't really mean much
obviously. This is really for people who are, you know,
not already fast.ai experts.

And so click one of these colored buttons. And what I will do, is I will go to the teacher
version and see what buttons you're pressing. All right! So, so far people are feeling we're not going
too fast on the whole. We've got one, nope not one, brief red. Okay! So, hey Nick, this url, the same thing with
teacher on the end, if you can you keep that open as well and let me know if it suddenly
gets covered in red.

If you are somebody who's red, I'm not going
to come to you now because there's not enough of you to stop the class. So it's up to you to ask on the forum or on
the YouTube live chat, and there's a lot of folks luckily who will be able to help you, I hope. All right! I wanted to do a big shout out to Radek. Radek created cups.fast.ai for me. I said to him last week I need a way of seeing
coloured cups on the internet and he wrote it in one evening. And I also wanted to shout out that Radek
just announced today that he got a job at Nvidia AI and I wanted to say, you know, that
fast.ai alumni around the world very very frequently, like every day or two, email me
to say that they've got their dream job.

Yeah… If you're looking for inspiration on how to
get into the field, I couldn't recommend anything better than checking out
Radek's work. And he's actually written a book about his
journey. It's got a lot of tips in particular about
how to take advantage of fast.ai and make the most of these lessons. And so certainly check that out as well.

And if you're here live he's one of our TAs
as well so you can say hello to him afterwards. He looks exactly like this picture here. So I mentioned I spent a lot of time studying
education both for my home schooling duties and also for my courses, and you'll see that
there's something a bit different, very different, about this course… which is that we started
by training a model. We didn't start by doing an in-depth review
of linear algebra and calculus. That's because two of my favorite writers
and researchers on education, Paul Lockhart and David Perkins, and many others, talk about
how much better people learn when they learn with a context in place. So the way we learn math at school where we
do counting and then adding and then fractions and then decimals and then blah blah blah
and you know, 15 years later we start doing the really interesting stuff at grad school. That is not the way most people learn effectively.

The way most people learn effectively is from
the way we teach sports, for example, where we show you a whole game of sports. We show you how much fun it is. You go and start playing sports, simple versions
of them, you're not very good right and then you gradually put more and more pieces together. So that's how we do deep learning. You will go into as much depth as the most
sophisticated, technically detailed classes you'll find – later, right! But first you'll learn to be very very good
at actually building and deploying models.

And you will learn why and how things work
as you need, to get to the next level. Those of you who have spent a lot of time in technical education (like if you've done a PhD or something) will find this deeply
uncomfortable because you'll be wanting to understand why everything works from the start. Just do your best to go along with it. Those of you who haven't will find this very
natural. Oh! And this is Dylan Wiliam, who I mentioned
before – the guy who came up with the really cool cups thing. There'll be a lot of tricks that have come
out of the educational research literature scattered through this course.

On the whole I won't call them out, they'll
just be there, but maybe from time to time we'll talk about them. All right! So before we start talking about how we actually
built that model and how it works, I guess I should convince you that I'm worth listening
to. I'll try to do that reasonably quickly, because
I don't like tooting my own horn, but I know it's important.

So the first thing I mentioned about me, is
that me and my friend Sylvain wrote this extremely popular book “Deep Learning for Coders”
and that book is what this course is quite heavily based on. We're not going to be using any material from
the book directly, and you might be surprised by that, but the reason actually is that the
educational research literature shows that people learn things best when they hear the
same thing in multiple different ways. So I want you to read the book and you'll
also see the same information presented in a different way, in these videos. So one of the bits of homework after each
lesson will be to read a chapter of the book. A lot of people like the book.

Peter Norvig, Director of Research at Google, loves the book. In fact his one's here: “one of the best sources
for a programmer to become proficient in deep learning.” Eric Topol loves the book. Hal Varian, Emeritus Professor at Berkeley and Chief Economist at Google, likes the book. Jerome Pesenti, who is the head of AI at Facebook,
likes the book. A lot of people like the book, so hopefully
you'll find that you like this material as well. I've spent about 30 years of my life working
in and around machine learning, including building a number of companies that relied on it, and became the highest-ranked competitor in the world on Kaggle in
machine learning competitions.

My company Enlitic, which I founded, was the
first company to specialize in deep learning for medicine, and MIT voted it one of the
50 smartest companies in 2016, just above Facebook and SpaceX. I started fast.ai with Rachel Thomas, and that
was quite a few years ago now, but it's had a big impact on the world already. That includes work we've done with our students that has been globally recognized, such as our win in the
DAWNBench competition which showed how we could train big neural networks faster than
anybody in the world, and cheaper than anybody in the world. And so that was a really big step in 2018,
which actually made a big difference. Google started using our special approaches
in their models. Nvidia started optimizing their stuff using
our approaches. So it made quite a big difference there. I'm the inventor of the ULMFiT algorithm which
according to the Transformers book was one of the two key foundations behind the modern
NLP revolution. This is the paper here. And actually, you know, interesting point
about that, it was actually invented for a fast.ai course.

So the first time it appeared was not actually
in a journal. It was actually in lesson four of the course,
I think, the 2016 course, if I remember correctly. And, you know, most importantly of course,
I've been teaching this course since Version One. And this is actually, I think, this is the
very first version of it (which even back then was getting HBR's attention). A lot of people have been watching the course, and it's been, you know, really widely used. YouTube doesn't show likes anymore, so I have to show you our likes myself. You know, it's been amazing to see how many
alumni have gone from this to, you know, to really doing amazing things, you know. And so for example Andrej Karpathy told me
that at Tesla, I think he said, pretty much everybody who joins Tesla in AI is meant to
do this course.

I believe at OpenAI, they told me that all
the residents joining there first do this course. So this, you know this course, is really widely
used in industry and research, and people have a lot of success with it. Okay, so that's a bit of brief information
about why you should hopefully keep going with this. All right so let's get back to what's happened
here. Why are we able to create a bird recognizer
in a minute or two? And why couldn't we do it before? So I'm going to go back to 2012 and in 2012
this was how image recognition was done. This is the computational pathologist – it
was a project done at Stanford. A very successful, very famous project that
was looking at the five-year survival of breast cancer patients by looking at their histopathology
image slides. Now, so this is, like, what I would call a
classic machine learning approach.

And I spoke to the senior author of this,
Daphne Koller, and I asked her why they didn't use deep learning and she said “well it
just, you know, it wasn't really on the radar at that point.” So this is like a pre-deep-learning approach. And so the way they did this was they got
a big team of mathematicians and computer scientists and pathologists and so forth to
get together and build these ideas for features, like relationships between epithelial nuclear
neighbors.

They actually created thousands and thousands of features, and each one required a lot of expertise from a cross-disciplinary group
of experts at Stanford. So this project took years, and a lot of people,
and a lot of code, and a lot of math. And then once they had all these features
they then fed them into a machine learning model – in this case, logistic regression,
to predict survival. As I say it's very successful, right, but
it's not something that I could create for you in a minute at the start of a course.

The difference with neural networks is neural
networks don't require us to build these features. They build them for us! And so what actually happened was, in I think
it was 2013, Matt Zeiler and Rob Fergus took a trained neural network and they looked inside
it to see what it had learned. So we don't give it features, we ask it to
learn features. So when Zeiler and Fergus looked inside a
neural network, they looked at the actual weights in the model and they drew a picture
of them.

And this was nine of the sets of weights they
found. And this set of weights, for example, finds
diagonal edges. This set of weights finds 
yellow to blue gradients. And this set of weights finds red to green
gradients, and so forth, right. And then down here are examples of some bits
of photos which closely matched, for example, this feature detector. And deep learning, I mean, is deep because
we can then take these features and combine them to create more advanced features. So these are some layer two features. So there's a feature, for example, that finds
corners.

And a feature that finds curves. And a feature that finds circles. And here are some examples of bits of pictures
that the circle finder found. And so remember with a neural net which is
the basic function used in deep learning, we don't have to hand code any of these or
come up with any of these ideas. You just start with actually a random neural
network and you feed it examples and you have it learn to recognize things, and it
turns out that these are the things that it creates for itself. So you can then combine these features. And when you combine these features it creates
a feature detector, for example, that finds kind of repeating geometric shapes. And it creates a feature detector, for example,
that finds kind of frilly little things, which it looks like is finding the edges of flowers. And this feature detector here seems to be
finding words.

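To make "a set of weights that finds diagonal edges" concrete: a first-layer feature is just a small grid of numbers multiplied against each patch of the image and summed. Here's a hand-written 3x3 diagonal-edge filter in numpy – the point being that a trained network *learns* grids like this, rather than having them coded by hand:

```python
import numpy as np

# A hand-written 3x3 "diagonal edge" filter: positive weights along
# the diagonal, negative weights elsewhere.
filt = np.array([[ 2, -1, -1],
                 [-1,  2, -1],
                 [-1, -1,  2]])

# A tiny grayscale image containing a bright diagonal line.
img = np.eye(5) * 255

def response(img, filt, r, c):
    """Filter response for the 3x3 patch whose top-left corner is (r, c)."""
    return np.sum(img[r:r+3, c:c+3] * filt)

on_edge  = response(img, filt, 1, 1)   # patch centred on the diagonal
off_edge = response(img, filt, 0, 2)   # patch away from the diagonal
print(on_edge, off_edge)               # large positive vs. negative
```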
And so the deeper you get the more sophisticated
the features it can find are. And so you can imagine that trying to code
these things by hand would be, you know, insanely difficult, and you wouldn't know even what
to encode by hand, right! So what we're going to learn is how neural
networks do this automatically, right, but this is the key difference of why we can now
do things that previously we just didn't even conceive of as possible, because now we don't
have to hand code the features we look for. They can all be learned. it's important to recognize we're going to
be spending some time learning about building image based algorithms and image-based algorithms
are not just for images and in fact this is going to be a general theme. We're going to show you some foundational
techniques but with creativity these foundational techniques can be used very widely. So for example, an image recognizer can also
be used to classify sounds. So this was an example from one of our students
who posted on the forum and said for their project they would try classifying sounds
and so they basically took sounds and created pictures from their waveforms and then they
used an image recognizer on that and they got a state-of-the-art result by the way.

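For the curious, the sound-to-picture trick is roughly: slice the waveform into short windows, take each window's frequency magnitudes, and stack them into a 2-D array an image model can consume. A minimal numpy sketch with a synthetic 440 Hz tone – real projects use proper spectrogram libraries, and the window/sample-rate choices here are just illustrative:

```python
import numpy as np

sr = 8000                                  # sample rate (Hz), assumed
t = np.arange(sr) / sr                     # one second of sample times
wave = np.sin(2 * np.pi * 440 * t)         # a synthetic 440 Hz tone

win = 256                                  # samples per window
frames = [wave[i:i+win] for i in range(0, len(wave) - win, win)]
spec = np.abs(np.array([np.fft.rfft(f) for f in frames]))

print(spec.shape)   # (windows, frequency bins) - a 2-D "image"
print(np.argmax(spec[0]))  # brightest bin sits near 440 Hz
```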
Another of our students on the forum said
that they did something very similar to take time series and turn them into pictures and
then use image classifiers. Another of our students created pictures from
mouse movements from users of a computer system. So the clicks became dots and the movements
became lines and the speed of the movement became colors and then used that to create
an image classifier. So you can see, with some creativity,
there's a lot of things you can do with images. There's something else I wanted to point out
which is that as you saw when we trained a real working bird recognizer image model we
didn't need lots of math; there wasn't any. We didn't need lots of data. We had 200 pictures. We didn't need lots of expensive computers;
we just used my laptop. This is generally the case for the vast majority
of deep learning that you'll need in real life. There will be some math that pops up during
this course but we will teach it to you as needed or we'll refer you to external resources
as needed, but it'll just be the little bits that you actually need. You know, the myth that deep learning needs
lots of data, I think, is mainly passed along by big companies that want to sell you computers
to store lots of data and to process it. We find that most real world projects don't
need extraordinary amounts of data at all and as you'll see there's actually a lot of
fantastic places you can do state-of-the-art work for free nowadays, which is great news. One of the key reasons for this is because
of something called transfer learning which we'll be learning about a lot during this
course, and it's something where very few people are aware of how big the pay-off is. In this course we'll be using Pytorch. For those of you who are not particularly
close to the deep learning world, you might have heard of Tensorflow and not of Pytorch. You might be surprised to hear that Tensorflow
has been dying in popularity in recent years and Pytorch is actually growing rapidly and
in research repositories amongst the top papers, Tensorflow is a tiny minority
now compared to Pytorch. This is also great research that's come out
from Ryan O'Connor. He also discovered that the majority of researchers who were using Tensorflow in 2018 have now shifted to Pytorch, and I mention this because
what people use in research is a very strong leading indicator of what's going to happen
in industry because this is where you know all the new algorithms are going to come out.

This is where all the papers are going to
be written about. It's going to be increasingly difficult to
use Tensorflow. We've been using Pytorch since before it came
out, before the initial release, because we knew just from the technical fundamentals that it
was far better. So this course has been using Pytorch for
a long time. I will say however that Pytorch requires a
lot of hairy code for relatively simple things. This is the code required 
to implement a particular optimizer called AdamW in plain Pytorch. I actually copied this code from the Pytorch
repository so as you can see there's a lot of it. This gray bit here is the code required to
do the same thing with fast.ai.

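For reference, the mathematical update AdamW performs is itself tiny – most of that "hairy" Pytorch code is bookkeeping over parameter groups and optimizer state. A numpy sketch of a single AdamW step, with the commonly used default hyperparameters assumed (not copied from either library):

```python
import numpy as np

def adamw_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
               eps=1e-8, wd=0.01):
    """One AdamW update for parameters w given gradient g.
    m, v are the running moment estimates; t is the step count (from 1)."""
    m = b1 * m + (1 - b1) * g            # running mean of gradients
    v = b2 * v + (1 - b2) * g**2         # running mean of squared gradients
    m_hat = m / (1 - b1**t)              # bias correction
    v_hat = v / (1 - b2**t)
    # decoupled weight decay: applied directly to w, not mixed into the grad
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * w)
    return w, m, v

w = np.array([1.0, -2.0])
m = v = np.zeros_like(w)
w, m, v = adamw_step(w, np.array([0.5, -0.5]), m, v, t=1)
print(w)   # each weight nudged opposite its gradient, plus a tiny decay
```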
Fast.ai is a library we built on top of Pytorch. This huge difference is not because Pytorch
is bad. It's because Pytorch is designed to be a strong
foundation to build things on top of, like fast.ai. So… When you use fast.ai – the library, you get
access to all the power of Pytorch as well but you shouldn't be writing all this code
if you only need to write this much code, right? The problem of writing lots of code is that
that's lots of things to make mistakes with, lots of things to, you know, not have best
practices in, lots of things to maintain. In general we've found, particularly with
deep learning: less code is better. Particularly with fastai, the code you don't
write is code where we've basically found the best practices for you. So when you use the code that we've provided
for you, you know you'll generally find you get better results.

So… so fast.ai has been a really popular
library and it's very widely used in industry, in academia, and in teaching and as we go
through this course we'll be seeing more and more pure Pytorch as we get deeper and deeper
underneath to see exactly how things work. The fast.ai library just won the 2020 best
paper award for the paper about it in the journal Information, so again you can see it's a very well-regarded
library. Okay so… Okay we're still green, that's good.

So you may have noticed something interesting,
which is that I'm actually running code in these slides. That's because these slides are not in PowerPoint. These slides are in a Jupyter notebook. Jupyter notebook is the environment in which
you will be doing most of your computing. It's a web-based application which is extremely
popular and widely used in industry and in academia and in teaching and it is a very
very powerful way to experiment, explore, and build. Nowadays I would say most people, at least
most students, run Jupyter notebooks not on their own computers, particularly for data
science but on a cloud server of which there's quite a few and as I mentioned earlier if
you go to course.fast.ai you can see how to use various different cloud servers.

One I'm going to show an example of is Kaggle. So Kaggle doesn't just have competitions but
it also has a cloud notebook server and I've got quite a few examples there. So let me give you a quick example of how
we use Jupyter notebooks. To… to… to build stuff, to… to experiment,
to explore. So on kaggle, if you start with somebody else's
notebook… so why don't you start with this one, “Jupyter Notebook 101”.

If it's your own notebook you'll see a button
called edit. If it's somebody else's, that button will
say copy and edit. If you use somebody's notebook that you like,
make sure you click the upvote button to encourage them and to help other people find it before
you go ahead and copy and edit. And once we're in edit mode we can now use
this notebook, and to use it we can type in any arbitrary expression in Python and click run. The very first time we do that it says “session is starting” – it's basically launching a virtual computer for us to run our code.

This is all free. In a sense, it's like the world's most powerful
calculator. It's a calculator where you have all of the
capabilities of the world's I think most popular programming language – certainly it and
JavaScript would be the top two – directly at your disposal. So, Python does know how to do one plus one,
and so you can see here it spits out the answer. I hate clicking, so I always use keyboard shortcuts: instead of clicking this little arrow, you just press Shift+Enter to do the same thing. But as you can see there's not just calculations here, there's also prose, and so Jupyter notebooks are great for explaining to the version of yourself in six months' time what on earth you were doing – or to your co-workers, or the people in the open source community, or the people you're blogging for, etc. And so you just type prose, and as
you can see when we create a new cell, you can create a code cell which is a cell that
lets you type calculations, or a markdown cell which is a cell that lets you create
prose.

And the prose uses formatting in a little
mini language called “markdown”. There's so many tutorials around I won't explain
it to you but it lets you do things like links and so forth. So I'll let you follow through the tutorial
in your own time because it really explains to you what to do. One thing to point out is that sometimes you'll
see me use cells with an exclamation mark at the start. That's not Python, that's a bash shell command,
okay? So that's what the exclamation mark means. As you can see you can put images into notebooks,
and so the image I popped in here was the one showing that Jupyter won the 2017 software
system award which is pretty much the biggest award there is for this kind of software. Okay, so that's the basic idea of how we use
notebooks. So let's have a look at how we do our bird
or not bird model. One thing I always like to do when I'm using
something like Colab or Kaggle – cloud platforms that I'm not controlling – is to make
sure that I'm using the most recent version of any software.

So my first cell here is exclamation mark
pip install minus U (that means upgrade) q (for quiet) fastai. So that makes sure that we have the latest
version of fast.ai, and if you always have that at the start of your notebooks you're
never going to have those awkward forum threads where you say “why isn't this working?” and somebody says to you “oh you're using
an old version of some software!” So, you'll see here this notebook is the exact
thing that I was showing you at the start of this lesson. So, if you haven't done much Python, you might be surprised about how little code there is here, and so Python is a concise but not too
concise language.
